Finetuning
#2 by AlexWortega
First of all, awesome work!
Are you planning to publish any instructions, GitHub repo for fine-tuning?
LLaMA Factory and ms-swift are supported; check our GitHub.
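For illustration, a LoRA run through ms-swift's Python entry point might look roughly like the sketch below. This assumes ms-swift 3.x (argument names changed between major versions), and the model id and dataset spec are placeholders, not this repo's actual identifiers; check the ms-swift README for the exact arguments.

```python
# Minimal sketch of LoRA SFT via ms-swift's Python entry point.
# Assumes ms-swift 3.x; argument names differ across versions,
# so verify against the ms-swift docs for your install.
from swift.llm import sft_main, TrainArguments

result = sft_main(TrainArguments(
    model='your-org/your-moe-model',   # placeholder model id
    train_type='lora',                 # LoRA instead of full fine-tuning
    dataset=['your-dataset#1000'],     # placeholder dataset spec
    lora_rank=16,
    lora_alpha=32,
    output_dir='output',
))
```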
Hey! Thanks for the great work and the good Russian-language support in the model.
I have an additional question regarding fine-tuning, specifically using LoRA adapters.
Is it even worth doing, given the sparse expert activation, the small router size, and so on?
Do you have any recommendations for MoE-specific LoRA fine-tuning of this model?
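For concreteness, here is a minimal sketch of one common starting point the question is getting at: applying LoRA only to the attention projections with Hugging Face PEFT, leaving the expert FFNs and the router frozen. The model id and module names are assumptions, not taken from this repo; inspect the checkpoint's layer names before adapting it.

```python
# Sketch: attention-only LoRA on an MoE checkpoint with PEFT.
# The expert MLPs and the router stay frozen, which sidesteps
# the issue that each expert only sees its routed subset of
# tokens. Model id and module names below are assumptions.
import torch
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained(
    "your-org/your-moe-model",  # placeholder id
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    # Attention projections see every token, so their adapters
    # get dense gradient signal; expert FFNs and the router
    # are deliberately left out of target_modules.
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, config)
model.print_trainable_parameters()
```

The intuition behind this split: adapters on the shared attention layers are updated by every token, while an adapter on a sparsely activated expert would only receive gradients from the tokens routed to that expert, so if you do include the expert FFNs, their LoRA ranks are usually kept small.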