Bidirectional isiZulu↔English Translation Model
Fine-tuned model for bidirectional translation between isiZulu and English with improved hyperparameters.
Model Details
- Base Model: google/gemma-3-4b-it
- Task: Bidirectional isiZulu ↔ English Translation
- Training Examples: 50,000 (both directions)
- Prompt Formats:
  - "Translate this from isiZulu to English: [text]"
  - "Translate this from English to isiZulu: [text]"
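The two prompt formats above can be wrapped in a small helper so that callers cannot mistype the template. This is a minimal sketch: the template strings are taken from the list above, while the helper name `build_prompt` and the direction keys `"zu-en"` and `"en-zu"` are illustrative assumptions, not part of the released model.

```python
# Prompt templates copied from the model card.
# The dictionary keys and the helper name are illustrative assumptions.
PROMPT_TEMPLATES = {
    "zu-en": "Translate this from isiZulu to English: {text}",
    "en-zu": "Translate this from English to isiZulu: {text}",
}


def build_prompt(text: str, direction: str) -> str:
    """Return the user-turn prompt for the requested translation direction."""
    if direction not in PROMPT_TEMPLATES:
        raise ValueError(f"direction must be one of {sorted(PROMPT_TEMPLATES)}")
    # Strip surrounding whitespace so the prompt matches the documented format.
    return PROMPT_TEMPLATES[direction].format(text=text.strip())


print(build_prompt("Ngiyabonga kakhulu", "zu-en"))
```

The returned string is meant to be used as the `content` of a single user message before the chat template is applied.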
Training Configuration
LoRA Parameters
- Rank: 16
- Alpha: 16
- Dropout: 0.1
- Target Modules: all-linear
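For readers who want to reproduce a similar adapter, the LoRA parameters above might map onto `peft.LoraConfig` arguments roughly as follows. This is a sketch, not the author's training script: the dictionary name is hypothetical and `task_type` is an assumption that the card does not state. The snippet also shows that with rank and alpha both at 16, the standard LoRA scaling factor (alpha divided by rank) is 1.

```python
# Keys follow the argument names of peft.LoraConfig; values come from the list above.
lora_kwargs = {
    "r": 16,                         # Rank
    "lora_alpha": 16,                # Alpha
    "lora_dropout": 0.1,             # Dropout
    "target_modules": "all-linear",  # Target Modules
    "task_type": "CAUSAL_LM",        # assumption: not stated in the model card
}

# Standard LoRA scales the low-rank update by lora_alpha / r.
lora_scaling = lora_kwargs["lora_alpha"] / lora_kwargs["r"]
print(lora_scaling)

# With peft installed, the adapter config could then be built as:
#   from peft import LoraConfig
#   config = LoraConfig(**lora_kwargs)
```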
Training Parameters
- Learning Rate: 0.0001
- Epochs: 3
- Batch Size: 8
- Gradient Accumulation: 8
- Effective Batch Size: 64
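The effective batch size follows from the per-step batch size and the gradient accumulation steps. The arithmetic below also gives a rough estimate of the number of optimizer steps, assuming a single device and that all 50,000 examples are used for training. Neither assumption is stated in the card, and the exact step count depends on how the training framework handles the final partial batch.

```python
import math

num_examples = 50_000
batch_size = 8
gradient_accumulation_steps = 8
num_epochs = 3
num_devices = 1  # assumption: single-device training

# Examples consumed per optimizer update.
effective_batch_size = batch_size * gradient_accumulation_steps * num_devices

# Approximate optimizer updates, counting the last partial batch as one step.
steps_per_epoch = math.ceil(num_examples / effective_batch_size)
total_steps = steps_per_epoch * num_epochs

print(effective_batch_size, steps_per_epoch, total_steps)
```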
Usage
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
# Load model
base_model = AutoModelForCausalLM.from_pretrained("google/gemma-3-4b-it")
tokenizer = AutoTokenizer.from_pretrained("google/gemma-3-4b-it")
model = PeftModel.from_pretrained(base_model, "Dineochiloane/gemma-3-4b-it-inkuba")
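# Translate isiZulu to English
messages = [{"role": "user", "content": "Translate this from isiZulu to English: Ngiyabonga kakhulu"}]
# add_generation_prompt=True appends the model-turn marker so the model answers the request
input_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")
# do_sample=True is needed for temperature to take effect
outputs = model.generate(input_ids, max_new_tokens=50, do_sample=True, temperature=0.7, repetition_penalty=1.2)
# Decode only the newly generated tokens, skipping the prompt
response = tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True)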
print(response)
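# Translate English to isiZulu
messages = [{"role": "user", "content": "Translate this from English to isiZulu: Thank you very much"}]
# add_generation_prompt=True appends the model-turn marker so the model answers the request
input_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")
# do_sample=True is needed for temperature to take effect
outputs = model.generate(input_ids, max_new_tokens=50, do_sample=True, temperature=0.7, repetition_penalty=1.2)
# Decode only the newly generated tokens, skipping the prompt
response = tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True)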
print(response)
Dataset Information
- Source: lelapa/Inkuba-instruct (isiZulu train split)
- Filtering: MMT task examples that contain "isingisi" (isiZulu for "English")
- Training Strategy: Bidirectional (both Zulu→English and English→Zulu)
- Original Examples: 25,000
- Total Training Examples: 50,000 (doubled for bidirectionality)
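The doubling described above can be sketched as follows: each isiZulu/English pair yields one example per direction, so 25,000 pairs become 50,000 training examples. This is an illustration under assumptions. The function name, the `prompt` and `response` field names, and the input format (plain tuples) are hypothetical and do not reflect the actual Inkuba-instruct schema or the author's preprocessing code.

```python
from typing import Dict, Iterable, List, Tuple

# Prompt templates as documented in the model card.
ZU_TO_EN = "Translate this from isiZulu to English: {text}"
EN_TO_ZU = "Translate this from English to isiZulu: {text}"


def make_bidirectional(pairs: Iterable[Tuple[str, str]]) -> List[Dict[str, str]]:
    """Turn (isiZulu, English) pairs into one training example per direction."""
    examples = []
    for zulu, english in pairs:
        # isiZulu -> English
        examples.append({"prompt": ZU_TO_EN.format(text=zulu), "response": english})
        # English -> isiZulu
        examples.append({"prompt": EN_TO_ZU.format(text=english), "response": zulu})
    return examples


pairs = [("Ngiyabonga kakhulu", "Thank you very much")]
examples = make_bidirectional(pairs)
print(len(examples))
```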
Improvements in Bidirectional Version
- Bidirectional capability: Can translate both Zulu→English and English→Zulu
- Improved hyperparameters: Lower learning rate and higher dropout for better generalization
- Reduced epochs: Compensates for doubled training data
- Generation settings: temperature=0.7 and repetition_penalty=1.2 are recommended (temperature only takes effect when sampling is enabled with do_sample=True)