arsath-sm/gemma3-tamil-translator

Fine-tuned Gemma-3 4B Instruct model specialized in high-quality English → Tamil translation
Trained using Unsloth + LoRA on a single T4 GPU (Colab Free tier)

BLEU Score: 36.12 | chrF++ Score: 65.25 (on 204-sentence held-out test set)

On the 204-sentence held-out test set, this model outperforms both the untuned Gemma-3 4B base model and Llama-3 8B Instruct on English → Tamil translation (see Evaluation Results below).

Model Details

| Attribute | Value |
|---|---|
| Base Model | unsloth/gemma-3-4b-it-bnb-4bit |
| Architecture | Gemma-3 4B Instruct (8K context) |
| Fine-tuning Method | LoRA (r=16, alpha=16, dropout=0) via Unsloth |
| Training Data | ~1,836 curated, high-quality English–Tamil parallel sentences |
| Training Epochs | 2 |
| Effective Batch Size | 8 (2 per device × 4 gradient accumulation steps) |
| Learning Rate | 2e-4 |
| Quantization | 4-bit during training; merged and saved in full precision (.safetensors) |
| Training Environment | Google Colab (T4 GPU, 15 GB VRAM) |
| Training Time | ~2.5 hours |
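
For readers who want to reproduce the setup, here is a minimal sketch of how the hyperparameters above map onto an Unsloth + TRL training run. It is not the actual training script: the dataset file, the prompt-formatting assumption, and max_seq_length=2048 are placeholders, and it assumes a TRL version whose SFTTrainer still accepts tokenizer, dataset_text_field, and max_seq_length directly (as in the standard Unsloth notebooks).

from unsloth import FastLanguageModel
from trl import SFTTrainer
from transformers import TrainingArguments
from datasets import load_dataset

# Load the 4-bit base model (same checkpoint listed under "Base Model" above).
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/gemma-3-4b-it-bnb-4bit",
    max_seq_length=2048,   # assumed value; not stated on this card
    load_in_4bit=True,     # 4-bit during training
)

# Attach LoRA adapters with the configuration from the table above.
model = FastLanguageModel.get_peft_model(
    model,
    r=16, lora_alpha=16, lora_dropout=0,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    bias="none",
    use_gradient_checkpointing="unsloth",
)

# Placeholder dataset: one pre-formatted prompt/response string per row in a "text" column.
dataset = load_dataset("json", data_files="en_ta_parallel.jsonl", split="train")

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=2048,
    args=TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,   # effective batch size 8
        num_train_epochs=2,
        learning_rate=2e-4,
        fp16=True,                       # the T4 GPU used here has no bf16 support
        logging_steps=10,
        output_dir="outputs",
    ),
)
trainer.train()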

Evaluation Results (204-sentence test set)

| Model | BLEU | chrF++ |
|---|---|---|
| arsath-sm/gemma3-tamil-translator (this model) | 36.12 | 65.25 |
| Base Gemma-3 4B (untuned) | 28.84 | 59.95 |
| Llama-3 8B Instruct | 2.94 | 21.35 |

+25% relative improvement in BLEU over the base model!
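
Both metrics can be recomputed with sacrebleu. The sketch below uses default sacrebleu settings and placeholder file names ("hypotheses.ta", "references.ta", one sentence per line); the exact settings behind the scores above are not documented on this card.

import sacrebleu

# Placeholder files: model outputs and reference translations, one sentence per line.
with open("hypotheses.ta", encoding="utf-8") as f:
    hypotheses = [line.strip() for line in f]
with open("references.ta", encoding="utf-8") as f:
    references = [line.strip() for line in f]

# Corpus-level BLEU and chrF++ (chrF with word_order=2).
bleu = sacrebleu.corpus_bleu(hypotheses, [references])
chrf = sacrebleu.corpus_chrf(hypotheses, [references], word_order=2)

print(f"BLEU:   {bleu.score:.2f}")
print(f"chrF++: {chrf.score:.2f}")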

Intended Use

  • Accurate English → Tamil translation
  • Chat-style translation assistant
  • Integration into Tamil NLP apps, chatbots, education tools
  • Research on low-resource Indic language translation

Usage Example

from transformers import pipeline

# Load the merged checkpoint from the Hugging Face Hub.
translator = pipeline(
    "text-generation",
    model="arsath-sm/gemma3-tamil-translator",
    device=0  # GPU
)

prompt = """You are a highly skilled translator. Translate the following English text to Tamil accurately and naturally.

English: Thank you so much for your help today."""

# Greedy decoding; "generated_text" contains the prompt plus the completion,
# so keep only the text after the model's "Tamil:" marker.
output = translator(prompt, max_new_tokens=128, do_sample=False)[0]["generated_text"]
print(output.split("Tamil:")[-1].strip())
# → இன்று உங்கள் உதவிக்கு மிக்க நன்றி.
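
On memory-constrained GPUs such as the free-tier T4 the model was trained on, the merged weights can also be loaded back in 4-bit via bitsandbytes. This is an optional loading variant sketched under the assumption that bitsandbytes is installed, not a separate published checkpoint:

import torch
from transformers import BitsAndBytesConfig, pipeline

# Quantize the merged weights to 4-bit at load time to reduce VRAM usage.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,  # T4-friendly compute dtype
)

translator = pipeline(
    "text-generation",
    model="arsath-sm/gemma3-tamil-translator",
    model_kwargs={"quantization_config": bnb_config},
    device_map="auto",
)

Prompting and post-processing are identical to the full-precision example above.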
