Bidirectional isiZulu↔English Translation Model

LoRA adapter fine-tuned from google/gemma-3-4b-it for bidirectional translation between isiZulu and English, trained with the revised hyperparameters listed under Training Configuration.

Model Details

  • Base Model: google/gemma-3-4b-it
  • Task: Bidirectional isiZulu ↔ English Translation
  • Training Examples: 50,000 (both directions)
  • Prompt Formats:
    • "Translate this from isiZulu to English: [text]"
    • "Translate this from English to isiZulu: [text]"
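
Both prompt formats follow one template, so a small helper can build either direction. This is a minimal sketch; the function name `build_prompt` and its arguments are hypothetical and not part of the released model.

```python
def build_prompt(text: str, source: str, target: str) -> str:
    """Build a prompt in one of the two formats listed above."""
    languages = {"isiZulu", "English"}
    if source not in languages or target not in languages or source == target:
        raise ValueError("source and target must be 'isiZulu' and 'English', in either order")
    return f"Translate this from {source} to {target}: {text}"


zu_to_en = build_prompt("Ngiyabonga kakhulu", "isiZulu", "English")
en_to_zu = build_prompt("Thank you very much", "English", "isiZulu")
print(zu_to_en)
print(en_to_zu)
```

Keeping the wording identical to the training prompts is likely to matter, since the adapter was only shown these two phrasings.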

Training Configuration

LoRA Parameters

  • Rank: 16
  • Alpha: 16
  • Dropout: 0.1
  • Target Modules: all-linear
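
The settings above map onto the fields of PEFT's `LoraConfig` roughly as shown here. This is a sketch rather than the author's training script: `task_type="CAUSAL_LM"` is an assumption, and the name `lora_kwargs` is illustrative. Under standard LoRA scaling (alpha divided by rank), rank 16 with alpha 16 gives a scaling factor of 1.0.

```python
# Keyword arguments named after peft.LoraConfig fields.
lora_kwargs = {
    "r": 16,                         # Rank
    "lora_alpha": 16,                # Alpha
    "lora_dropout": 0.1,             # Dropout
    "target_modules": "all-linear",  # adapt every linear layer
    "task_type": "CAUSAL_LM",        # assumption: not stated in this card
}

# Standard LoRA multiplies the low-rank update by alpha / rank.
scaling = lora_kwargs["lora_alpha"] / lora_kwargs["r"]
print(scaling)

# With peft installed, the config could then be built as:
#   from peft import LoraConfig
#   config = LoraConfig(**lora_kwargs)
```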

Training Parameters

  • Learning Rate: 0.0001
  • Epochs: 3
  • Batch Size: 8
  • Gradient Accumulation: 8
  • Effective Batch Size: 64
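
The effective batch size follows from the two values above, and together with the 50,000 training examples it gives a rough count of optimizer steps. The sketch below assumes a single device and that the final partial batch is kept; the exact step count reported by a given trainer may differ slightly.

```python
import math

batch_size = 8
gradient_accumulation = 8
epochs = 3
training_examples = 50_000

# Examples consumed per optimizer update (single device assumed).
effective_batch_size = batch_size * gradient_accumulation

# Approximate optimizer updates, keeping the last partial batch.
steps_per_epoch = math.ceil(training_examples / effective_batch_size)
total_steps = steps_per_epoch * epochs

print(effective_batch_size, steps_per_epoch, total_steps)
```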

Usage

from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

# Load the base model and tokenizer, then attach the LoRA adapter
base_model = AutoModelForCausalLM.from_pretrained("google/gemma-3-4b-it")
tokenizer = AutoTokenizer.from_pretrained("google/gemma-3-4b-it")
model = PeftModel.from_pretrained(base_model, "Dineochiloane/gemma-3-4b-it-inkuba")
model.eval()

def translate(prompt):
    messages = [{"role": "user", "content": prompt}]
    # add_generation_prompt=True appends the assistant turn marker so the model answers
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt", return_dict=True
    )
    # do_sample=True is needed for temperature to take effect
    outputs = model.generate(
        **inputs, max_new_tokens=50, do_sample=True, temperature=0.7, repetition_penalty=1.2
    )
    # Decode only the newly generated tokens, not the echoed prompt
    new_tokens = outputs[0][inputs["input_ids"].shape[-1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)

# Translate isiZulu to English
print(translate("Translate this from isiZulu to English: Ngiyabonga kakhulu"))

# Translate English to isiZulu
print(translate("Translate this from English to isiZulu: Thank you very much"))

Dataset Information

  • Source: lelapa/Inkuba-instruct (isiZulu train split)
  • Filtering: MMT (machine translation) task examples that contain "isingisi" (isiZulu for "English")
  • Training Strategy: Bidirectional (both isiZulu→English and English→isiZulu)
  • Original Examples: 25,000
  • Total Training Examples: 50,000 (doubled for bidirectionality)
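
The doubling step can be pictured as emitting two training examples per sentence pair, one for each direction. This is only a sketch of the idea, not the author's preprocessing code: the function name `make_bidirectional`, the `messages` layout, and the in-memory `pairs` list are all assumptions for illustration.

```python
def make_bidirectional(pairs):
    """Turn (isiZulu, English) pairs into chat-style examples for both directions."""
    examples = []
    for zu, en in pairs:
        # isiZulu -> English
        examples.append({"messages": [
            {"role": "user", "content": f"Translate this from isiZulu to English: {zu}"},
            {"role": "assistant", "content": en},
        ]})
        # English -> isiZulu
        examples.append({"messages": [
            {"role": "user", "content": f"Translate this from English to isiZulu: {en}"},
            {"role": "assistant", "content": zu},
        ]})
    return examples


pairs = [("Ngiyabonga kakhulu", "Thank you very much")]
examples = make_bidirectional(pairs)
print(len(examples))
```

Applied to 25,000 pairs, this scheme yields the 50,000 examples listed above.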

Improvements in Bidirectional Version

  • Bidirectional capability: Translates both isiZulu→English and English→isiZulu
  • Revised hyperparameters: Lower learning rate (0.0001) and higher dropout (0.1) for better generalization
  • Fewer epochs (3): Compensates for the doubled training data
  • Generation settings: temperature=0.7 and repetition_penalty=1.2 are recommended; temperature only takes effect when sampling is enabled (do_sample=True)