arsath-sm/gemma3-tamil-translator
Fine-tuned Gemma-3 4B Instruct model specialized in high-quality English → Tamil translation
Trained using Unsloth + LoRA on a single T4 GPU (Colab Free tier)
BLEU: 36.12 | chrF++: 65.25 (self-reported, on a 204-sentence held-out test set)
On this test set the model outperforms both the base Gemma-3 4B (BLEU 28.84, chrF++ 59.95) and Llama-3 8B Instruct (BLEU 2.94, chrF++ 21.35) on English-to-Tamil translation.
Model Details
| Attribute | Value |
|---|---|
| Base Model | unsloth/gemma-3-4b-it-bnb-4bit |
| Architecture | Gemma-3 4B Instruct (8K context) |
| Fine-tuning Method | LoRA (r=16, alpha=16, dropout=0) via Unsloth |
| Training Data | ~1,836 high-quality English–Tamil parallel sentences (curated) |
| Training Epochs | 2 |
| Effective Batch Size | 8 (2 per device × 4 gradient accumulation steps) |
| Learning Rate | 2e-4 |
| Quantization | 4-bit during training, merged & saved in full precision (.safetensors) |
| Training Environment | Google Colab (T4 GPU, 15GB VRAM) |
| Training Time | ~2.5 hours |
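
As a rough sanity check on the table above, the number of optimizer steps can be estimated from the dataset size, batch settings, and epoch count. This is only a sketch: it treats the approximate figure of 1,836 sentences as exact and assumes the last partial batch of each epoch is kept, whereas a given trainer may round differently.

```python
import math

# Values taken from the Model Details table (dataset size is approximate).
train_sentences = 1836
per_device_batch = 2
grad_accum_steps = 4
num_devices = 1  # single T4 GPU
epochs = 2

# Effective batch size = per-device batch x accumulation steps x devices.
effective_batch = per_device_batch * grad_accum_steps * num_devices

# Assumption: the final partial batch counts as a step (ceil, not floor).
steps_per_epoch = math.ceil(train_sentences / effective_batch)
total_steps = steps_per_epoch * epochs

print(f"effective batch size: {effective_batch}")
print(f"steps per epoch:      {steps_per_epoch}")
print(f"total steps:          {total_steps}")
```

Under these assumptions the run comes to roughly 460 optimizer steps in total.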
Evaluation Results (204-sentence test set)
| Model | BLEU | chrF++ |
|---|---|---|
| arsath-sm/gemma3-tamil-translator (this model) | 36.12 | 65.25 |
| Base Gemma-3 4B (untuned) | 28.84 | 59.95 |
| Llama-3 8B Instruct | 2.94 | 21.35 |
Fine-tuning gives a relative BLEU improvement of about 25% over the base model (36.12 vs. 28.84) and raises chrF++ by 5.30 points (65.25 vs. 59.95).
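
The relative improvement can be reproduced directly from the scores in the table. The short sketch below only does that arithmetic; the helper name `relative_gain` is illustrative and not part of any evaluation tooling used for this model.

```python
def relative_gain(tuned: float, base: float) -> float:
    """Relative improvement of `tuned` over `base`, in percent."""
    return (tuned - base) / base * 100.0

# Scores from the evaluation table (204-sentence test set).
bleu_tuned, bleu_base = 36.12, 28.84
chrf_tuned, chrf_base = 65.25, 59.95

bleu_gain = relative_gain(bleu_tuned, bleu_base)
chrf_gain = relative_gain(chrf_tuned, chrf_base)

print(f"BLEU:   +{bleu_tuned - bleu_base:.2f} points ({bleu_gain:.1f}% relative)")
print(f"chrF++: +{chrf_tuned - chrf_base:.2f} points ({chrf_gain:.1f}% relative)")
```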
Intended Use
- Accurate English → Tamil translation
- Chat-style translation assistant
- Integration into Tamil NLP apps, chatbots, education tools
- Research on low-resource Indic language translation
Usage Example
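from transformers import pipeline

# Load the fine-tuned model as a text-generation pipeline.
translator = pipeline(
    "text-generation",
    model="arsath-sm/gemma3-tamil-translator",
    device=0,  # first GPU; use device=-1 to run on CPU
)

prompt = """You are a highly skilled translator. Translate the following English text to Tamil accurately and naturally.
English: Thank you so much for your help today."""

# Greedy decoding; return_full_text=False drops the echoed prompt from the output.
output = translator(prompt, max_new_tokens=128, do_sample=False, return_full_text=False)[0]["generated_text"]
# Keep only the text after a "Tamil:" label, if the model emits one.
print(output.split("Tamil:")[-1].strip())
# → இன்று உங்கள் உதவிக்கு மிக்க நன்றி.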
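
For translating many sentences, it may help to wrap prompt construction and output clean-up in small helpers. The sketch below reuses the instruction text from the example above; the function names `build_prompt` and `extract_translation` are hypothetical, and the assumption that the model may prefix its answer with a "Tamil:" label comes from the example rather than from any documented output format.

```python
# Instruction text copied from the usage example above.
INSTRUCTION = (
    "You are a highly skilled translator. Translate the following "
    "English text to Tamil accurately and naturally."
)


def build_prompt(english_text: str) -> str:
    """Build the translation prompt for one English sentence."""
    return f"{INSTRUCTION}\nEnglish: {english_text.strip()}"


def extract_translation(generated_text: str, prompt: str = "") -> str:
    """Strip an echoed prompt and an optional 'Tamil:' label from model output."""
    text = generated_text
    # Some pipelines return the prompt followed by the continuation.
    if prompt and text.startswith(prompt):
        text = text[len(prompt):]
    # Assumption: the model may label its answer with "Tamil:".
    if "Tamil:" in text:
        text = text.split("Tamil:")[-1]
    return text.strip()


example_prompt = build_prompt("Thank you so much for your help today.")
print(example_prompt)
```

With the pipeline from the example, a call would look like `extract_translation(translator(p, max_new_tokens=128, do_sample=False)[0]["generated_text"], p)` where `p = build_prompt(...)`.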