lyk2586 committed
Commit ff1a396 · verified · 1 parent: b076a40

Update README.md

Files changed (1):
  1. README.md +1 -1
README.md CHANGED
@@ -36,7 +36,7 @@ For full transparency and reproducibility, please refer to our technical report
 
 
 
-🚀 The **JT-Math-8B-Instruct** is an 8-billion parameter language model built on the **Jiutian LLM architecture** with a **context length of 32,768 tokens**. Its development involved two key stages: initial pre-training of the **JT-Math-8B-Base** model on a diverse corpus of text and mathematical data, followed by a two-stage instruction tuning process. This tuning began with **Supervised Fine-Tuning (SFT)**, where the model was trained on a high-quality, multilingual dataset of mathematical problems and solutions in both English and Chinese to grasp problem-solving patterns. Subsequently, **Reinforcement Learning (RL)** was applied within an 8K context window to enhance reasoning accuracy, minimize logical fallacies, and align the model more closely with human preferences for clear and correct mathematical solutions.
+🚀 The **JT-Math-8B-Instruct** is an 8-billion parameter language model built on the **Jiutian LLM architecture** with a **context length of 32,768 tokens**. Its development involved two key stages: initial pre-training of the **JT-Math-8B-Base** model on a diverse corpus of text and mathematical data, followed by a two-stage instruction tuning process. This tuning began with **Supervised Fine-Tuning (SFT)**, where the model was trained on a high-quality, multilingual dataset of mathematical problems and solutions in both English and Chinese to grasp problem-solving patterns. Subsequently, **Reinforcement Learning (RL)** was applied to enhance reasoning accuracy, minimize logical fallacies, and align the model more closely with human preferences for clear and correct mathematical solutions.
 
 
 
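For readers landing on the README paragraph above, a minimal inference sketch may help illustrate how an instruction-tuned, 32K-context model like this is typically used. This is a hedged sketch, not code from the repository: the Hub repository ID `JT-LLM/JT-Math-8B-Instruct` is a hypothetical placeholder, and the `trust_remote_code=True` flag and chat template are assumptions (the Jiutian LLM architecture may or may not ship custom modeling code).

```python
# Hypothetical usage sketch for JT-Math-8B-Instruct with Hugging Face
# transformers. The repo ID below is a placeholder, NOT confirmed by this
# commit -- substitute the actual Hub ID of this repository.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "JT-LLM/JT-Math-8B-Instruct"  # hypothetical placeholder ID

# trust_remote_code=True is an assumption, in case the Jiutian architecture
# ships custom modeling code on the Hub.
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # load weights in the checkpoint's native precision
    device_map="auto",    # spread weights across available GPU(s)/CPU
    trust_remote_code=True,
)

# Build a chat-style prompt; assumes the tokenizer provides a chat template.
messages = [
    {"role": "user", "content": "Solve for x: 2x + 7 = 19. Show your steps."}
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# The 32,768-token context window leaves ample room for long, step-by-step
# mathematical solutions.
outputs = model.generate(input_ids, max_new_tokens=512)
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

Note that the diff itself only removes the "within an 8K context window" qualifier from the RL sentence; the model's 32,768-token inference context stated in the paragraph is unchanged.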