lyk2586 committed
Commit ff1a396 · verified · 1 parent: b076a40

Update README.md

Files changed (1):
  1. README.md +1 -1
README.md CHANGED
@@ -36,7 +36,7 @@ For full transparency and reproducibility, please refer to our technical report
 
 
 
-🚀 The **JT-Math-8B-Instruct** is an 8-billion parameter language model built on the **Jiutian LLM architecture** with a **context length of 32,768 tokens**. Its development involved two key stages: initial pre-training of the **JT-Math-8B-Base** model on a diverse corpus of text and mathematical data, followed by a two-stage instruction tuning process. This tuning began with **Supervised Fine-Tuning (SFT)**, where the model was trained on a high-quality, multilingual dataset of mathematical problems and solutions in both English and Chinese to grasp problem-solving patterns. Subsequently, **Reinforcement Learning (RL)** was applied within an 8K context window to enhance reasoning accuracy, minimize logical fallacies, and align the model more closely with human preferences for clear and correct mathematical solutions.
+🚀 The **JT-Math-8B-Instruct** is an 8-billion parameter language model built on the **Jiutian LLM architecture** with a **context length of 32,768 tokens**. Its development involved two key stages: initial pre-training of the **JT-Math-8B-Base** model on a diverse corpus of text and mathematical data, followed by a two-stage instruction tuning process. This tuning began with **Supervised Fine-Tuning (SFT)**, where the model was trained on a high-quality, multilingual dataset of mathematical problems and solutions in both English and Chinese to grasp problem-solving patterns. Subsequently, **Reinforcement Learning (RL)** was applied to enhance reasoning accuracy, minimize logical fallacies, and align the model more closely with human preferences for clear and correct mathematical solutions.
 
 
 
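For readers landing on the README paragraph above, a minimal inference sketch may help illustrate how an instruction-tuned, 32K-context model like this is typically used. This is a hedged sketch, not code from the repository: the Hub repository ID `JT-LLM/JT-Math-8B-Instruct` is a hypothetical placeholder, and the `trust_remote_code=True` flag and chat template are assumptions (the Jiutian LLM architecture may or may not ship custom modeling code).

```python
# Hypothetical usage sketch for JT-Math-8B-Instruct with Hugging Face
# transformers. The repo ID below is a placeholder, NOT confirmed by this
# commit -- substitute the actual Hub ID of this repository.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "JT-LLM/JT-Math-8B-Instruct"  # hypothetical placeholder ID

# trust_remote_code=True is an assumption, in case the Jiutian architecture
# ships custom modeling code on the Hub.
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # load weights in the checkpoint's native precision
    device_map="auto",    # spread weights across available GPU(s)/CPU
    trust_remote_code=True,
)

# Build a chat-style prompt; assumes the tokenizer provides a chat template.
messages = [
    {"role": "user", "content": "Solve for x: 2x + 7 = 19. Show your steps."}
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# The 32,768-token context window leaves ample room for long, step-by-step
# mathematical solutions.
outputs = model.generate(input_ids, max_new_tokens=512)
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

Note that the diff itself only removes the "within an 8K context window" qualifier from the RL sentence; the model's 32,768-token inference context stated in the paragraph is unchanged.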