---
base_model:
- mistralai/Mistral-Small-3.2-24B-Instruct-2506
tags:
- instruct
- finetune
- chatml
- axolotl
- roleplay
license: apache-2.0
language:
- en
---

![image/jpg](Codex.jpg)

# Codex-24B-Small-3.2

**Note: This model does not include vision. It is text-only.**

Not counting my AI Dungeon collaboration, it's been a while since I did a personal release that wasn't Pantheon, but here we are! You can consider Codex a research-oriented roleplay experiment in which I've tried to induce as much synthetic diversity as possible. Gone are the typical "Charname/he/she does this" responses, and welcome is, well, anything else! You have to try it to understand, really.

The datasets themselves contain countless other breakthroughs and improvements, but I'd say the most important one is embracing the full human spectrum of diverse storytelling. Whether it's wholesome or dark, this model will not judge, and it intends to deliver. (Or tries to, anyway!)

Your user feedback is critical to me, so don't hesitate to tell me whether my model is 1. terrible, 2. awesome, or 3. somewhere in between.

## Model details

Considering Small 3.2 boasts about reduced repetition, I figured this was the time to train it on the very thing I've been focusing on for the past few months: systematic pattern diversity!

This finetune combines approximately 39 million tokens of carefully curated data:

- GPT 4.1 Instruct core for clean instruction following
- DeepSeek V3/R1 roleplay data
- Curated "best of" Pantheon interactions
- Diverse text adventure compilations

Each dataset component was specifically validated for structural variance: responses rarely start the same way, sentence patterns vary, and conversations run 10-40 turns. This builds on months of diversity optimization research aimed at breaking common AI response patterns. It's been... quite a journey.

About half of the roleplay dataset is in Markdown asterisk format, while the majority of the other data is written in a narrative (book-style) format: present tense, second-person perspective.

## Inference

Mistral really loves recommending unusual inference settings, but I've been getting decent results with the settings below:

```json
{
  "temperature": 0.8,
  "repetition_penalty": 1.05,
  "min_p": 0.05
}
```

Having character names in front of messages is not a requirement, but it remains a personal recommendation of mine: it seems to help the model focus more on the character(s) in question. World-focused text adventures do fine without it.

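For reference, here's a minimal sketch of how these settings could be passed to `generate()` via the Hugging Face transformers library. Everything beyond the sampler values is my own assumption: the repo id is inferred from the model name, the example messages are invented, and `apply_chat_template` presupposes the repository ships the ChatML template described below. Note that `min_p` sampling requires a reasonably recent transformers release.

```python
# Illustrative sketch only; repo id and messages are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Gryphe/Codex-24B-Small-3.2"  # assumed repository id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [
    {"role": "system", "content": "You narrate a gritty text adventure."},
    {"role": "user", "content": "I light the lantern and descend the stairs."},
]

# Assumes the repo's tokenizer config carries the ChatML template
# shown in the Prompt Format section below.
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(
    input_ids,
    max_new_tokens=512,
    do_sample=True,
    temperature=0.8,           # recommended settings from above
    repetition_penalty=1.05,
    min_p=0.05,                # needs a recent transformers version
)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```
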
## Prompt Format

The model was trained using ChatML.

```
<|im_start|>system
SYSTEM MESSAGE GOES HERE<|im_end|>
<|im_start|>user
USER MESSAGE GOES HERE<|im_end|>
<|im_start|>assistant
Character:
```

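If you build prompts by hand instead of through a chat template, a helper like the one below (my illustration, not part of the card; `build_chatml_prompt` is a hypothetical name) assembles the same structure, including the optional character-name prefill recommended in the Inference section:

```python
# Illustrative ChatML prompt assembly; not an official snippet from this card.
def build_chatml_prompt(system: str, turns: list[tuple[str, str]], prefill: str = "") -> str:
    """Build a ChatML prompt string.

    turns: (role, message) pairs, with role being "user" or "assistant".
    prefill: optional text such as "Character:" to steer the next reply.
    """
    parts = [f"<|im_start|>system\n{system}<|im_end|>"]
    for role, message in turns:
        parts.append(f"<|im_start|>{role}\n{message}<|im_end|>")
    # Leave the assistant turn open so the model continues from the prefill.
    parts.append(f"<|im_start|>assistant\n{prefill}")
    return "\n".join(parts)

prompt = build_chatml_prompt(
    system="SYSTEM MESSAGE GOES HERE",
    turns=[("user", "USER MESSAGE GOES HERE")],
    prefill="Character:",  # drop this for world-focused text adventures
)
print(prompt)
```
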
## Credits

- Everyone from [Anthracite](https://huggingface.co/anthracite-org)! Hi, guys!
- [Latitude](https://huggingface.co/LatitudeGames), who decided to take me on as a finetuner and gave me the chance to accumulate even more experience in this fascinating field.
- All the folks I chat with on a daily basis on Discord! You know who you are.
- Anyone I forgot to mention, just in case!