Finetuning

#2
by AlexWortega - opened

First of all, awesome work!

Are you planning to publish any instructions or a GitHub repo for fine-tuning?

Llama Factory and ms-swift are supported; check our GitHub.
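
For example, a LoRA run with ms-swift could be launched roughly like this. This is a sketch, not a verified recipe: the model id and dataset path are placeholders, and the flag names are assumptions based on recent ms-swift versions, so check the ms-swift repo docs for the exact arguments for this model.

```python
# Sketch: launching an ms-swift LoRA fine-tune via its CLI from Python.
# All ids/paths are placeholders; flag names may differ between ms-swift versions.
import subprocess

subprocess.run(
    [
        "swift", "sft",
        "--model", "your-org/your-moe-model",  # placeholder model id
        "--train_type", "lora",                # LoRA instead of full fine-tuning
        "--dataset", "path/to/train.jsonl",    # placeholder dataset
        "--lora_rank", "16",
        "--output_dir", "output/lora-sft",
    ],
    check=True,
)
```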

Hey! Thanks for the great work and the good Russian language support in the model.
I have an additional question about fine-tuning, specifically with LoRA adapters.
Is it even worth doing, given the sparse expert activation, the small router size, etc.?

Do you have any recommendations for MoE-specific LoRA fine-tuning of this model?
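
For what it's worth, a common pattern for sparse MoE models is to attach LoRA only to the dense attention projections and leave the router (gate) and expert FFNs frozen, since adapting the small router can disturb expert load balancing. Below is a minimal sketch with Hugging Face PEFT; the model id is a placeholder and the target module names are assumptions that depend on the actual architecture, so inspect `model.named_modules()` first. Llama Factory and ms-swift expose similar target-module settings if you prefer their configs.

```python
# Minimal LoRA-on-MoE sketch with Hugging Face PEFT.
# Assumption: attention projections are named q_proj/k_proj/v_proj/o_proj,
# as in Llama-style blocks; verify against model.named_modules().
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("your-org/your-moe-model")  # placeholder id

lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    # Adapt only the dense attention projections; router and experts stay frozen.
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # sanity-check the trainable fraction
```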
