lucyknada committed · Commit e646a8e · verified · 1 Parent(s): b728ac1

Upload ./README.md with huggingface_hub

Files changed (1):
  1. README.md +79 -0
README.md ADDED
---
base_model:
- mistralai/Mistral-Small-3.2-24B-Instruct-2506
tags:
- instruct
- finetune
- chatml
- axolotl
- roleplay
license: apache-2.0
language:
- en
---
### exl3 quant
---
### check revisions for quants
---
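Each quant lives in its own repo revision (branch). As a minimal sketch of pulling one with `huggingface_hub` (the repo id and revision name below are assumptions, not confirmed names; check the repository's branch list for what actually exists):

```python
# Minimal sketch: fetch one quant revision with huggingface_hub.
# Both the repo id and the revision name are assumptions - check the
# repository's branch list for the revisions that actually exist.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(
    repo_id="lucyknada/Gryphe_Codex-24B-Small-3.2-exl3",  # assumed repo id
    revision="4.0bpw",  # hypothetical revision name
)
print(f"Quant downloaded to: {local_dir}")
```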

![image/jpg](Codex.jpg)

# Codex-24B-Small-3.2

**Note: This model does not include vision. It is text-only.**

Not counting my AI Dungeon collaboration, it's been a while since I did another personal release that wasn't Pantheon, but here we are! You can consider Codex a research-oriented roleplay experiment in which I've tried to induce as much synthetic diversity as possible. Gone are the typical "Charname/he/she does this" responses, and welcome is, well, anything else! You have to try it to understand, really.

In the datasets themselves are countless other breakthroughs and improvements, but I'd say the most important one is embracing the full human spectrum of diverse storytelling. Whether it's wholesome or dark, this model will not judge, and it intends to deliver. (Or tries to, anyway!)

GGUF quants [are available here](https://huggingface.co/bartowski/Gryphe_Codex-24B-Small-3.2-GGUF).

Your user feedback is critical to me, so don't hesitate to tell me whether my model is 1. terrible, 2. awesome, or 3. somewhere in between.

## Model details

Considering Small 3.2 [boasts about repetition reduction](https://huggingface.co/mistralai/Mistral-Small-3.2-24B-Instruct-2506), I figured this was the time to train it on the very work I've been focusing on: systematic pattern diversity!

This finetune combines approximately 39 million tokens of carefully curated data:

- GPT 4.1 Instruct core for clean instruction following
- DeepSeek V3/R1 roleplay data
- Curated "best of" Pantheon interactions
- Diverse text adventure compilations

Each dataset component was specifically validated for structural variance: responses rarely start the same way, sentence patterns are diverse, and conversations run 10-40 turns. This builds on months of diversity optimization research aimed at breaking common AI response patterns. It's been... quite a journey.

About half of the roleplay dataset is in Markdown asterisk format, while the majority of the other data is written in a narrative (book-style) present-tense, second-person perspective format.

## Inference

Mistral really loves recommending unusual inference settings, but I've been getting decent results with the settings below:

```
"temperature": 0.8,
"repetition_penalty": 1.05,
"min_p": 0.05
```
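As a rough illustration of wiring these values into an OpenAI-compatible server such as vLLM or TabbyAPI (the base URL and model name below are placeholders, and `repetition_penalty`/`min_p` ride along via `extra_body` because the official OpenAI schema doesn't define them):

```python
# Rough sketch: apply the recommended sampler settings through an
# OpenAI-compatible endpoint (e.g. vLLM or TabbyAPI). The base_url and
# model name are placeholders; repetition_penalty and min_p are passed
# via extra_body since the OpenAI client doesn't expose them directly.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="none")

response = client.chat.completions.create(
    model="Codex-24B-Small-3.2",  # placeholder model name
    messages=[
        {"role": "system", "content": "You are Character, a daring sky pirate."},
        {"role": "user", "content": "The storm is closing in. What do we do?"},
    ],
    temperature=0.8,
    extra_body={"repetition_penalty": 1.05, "min_p": 0.05},
)
print(response.choices[0].message.content)
```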

Having character names in front of messages is not a requirement, but it remains a personal recommendation of mine: it seems to help the model focus more on the character(s) in question. World-focused text adventures do fine without it.

## Prompt Format

The model was trained using ChatML.

```
<|im_start|>system
SYSTEM MESSAGE GOES HERE<|im_end|>
<|im_start|>user
USER MESSAGE GOES HERE<|im_end|>
<|im_start|>assistant
Character:
```
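For completion-style backends, here is a minimal sketch of assembling this template in Python, including the optional character-name prefill recommended above (the helper and its names are illustrative, not part of the model's tooling):

```python
# Illustrative helper (not part of the model's tooling): build a ChatML
# prompt string, optionally prefilling the assistant turn with a
# character name so the reply starts in-character.
def build_chatml_prompt(system: str, user: str, character: str | None = None) -> str:
    prompt = (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )
    if character:
        prompt += f"{character}:"  # prefill the assistant turn
    return prompt

print(build_chatml_prompt(
    "You are the narrator of a grim text adventure.",
    "I open the rusted hatch.",
    character="Narrator",
))
```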

## Credits

- Everyone from [Anthracite](https://huggingface.co/anthracite-org)! Hi, guys!
- [Latitude](https://huggingface.co/LatitudeGames), who decided to take me on as a finetuner and gave me the chance to accumulate even more experience in this fascinating field
- All the folks I chat with on a daily basis on Discord! You know who you are.
- Anyone I forgot to mention, just in case!