Finetuning

#2
by AlexWortega - opened

First of all, awesome work!

Are you planning to publish any instructions or a GitHub repo for fine-tuning?

Llama Factory and ms-swift are supported; check our GitHub.
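
For example, a LoRA run with ms-swift could be launched roughly like this. This is a sketch, not a verified recipe: the model id and dataset path are placeholders, and the flag names are assumptions based on recent ms-swift versions, so check the ms-swift repo docs for the exact arguments for this model.

```python
# Sketch: launching an ms-swift LoRA fine-tune via its CLI from Python.
# All ids/paths are placeholders; flag names may differ between ms-swift versions.
import subprocess

subprocess.run(
    [
        "swift", "sft",
        "--model", "your-org/your-moe-model",  # placeholder model id
        "--train_type", "lora",                # LoRA instead of full fine-tuning
        "--dataset", "path/to/train.jsonl",    # placeholder dataset
        "--lora_rank", "16",
        "--output_dir", "output/lora-sft",
    ],
    check=True,
)
```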

Hey! Thanks for the great work and the good Russian language support in the model.
I have an additional question about fine-tuning, specifically with LoRA adapters.
Is it even worth doing, given the sparse expert activation, the small router size, etc.?

Do you have any recommendations for MoE-specific LoRA fine-tuning of this model?
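
For what it's worth, a common pattern for sparse MoE models is to attach LoRA only to the dense attention projections and leave the router (gate) and expert FFNs frozen, since adapting the small router can disturb expert load balancing. Below is a minimal sketch with Hugging Face PEFT; the model id is a placeholder and the target module names are assumptions that depend on the actual architecture, so inspect `model.named_modules()` first. Llama Factory and ms-swift expose similar target-module settings if you prefer their configs.

```python
# Minimal LoRA-on-MoE sketch with Hugging Face PEFT.
# Assumption: attention projections are named q_proj/k_proj/v_proj/o_proj,
# as in Llama-style blocks; verify against model.named_modules().
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("your-org/your-moe-model")  # placeholder id

lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    # Adapt only the dense attention projections; router and experts stay frozen.
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # sanity-check the trainable fraction
```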
