
Any plans to release the training recipe?

#21
by nskwal - opened

Are there any plans to release the training recipe and configuration used with Megatron-LM?

NVIDIA org

Have you seen this paper? https://arxiv.org/pdf/2508.14444

@okuchaiev Are there any detailed scripts showing how to generate the training data and how to run the phased pre-training and SFT, so that the paper's metrics and conclusions can be reproduced?
