isiZulu AFRIMMLU Fine-tuned Model

Fine-tuned Gemma-3-4b-it model for isiZulu AFRIMMLU multiple-choice question answering.

Model Details

  • Base Model: google/gemma-3-4b-it
  • Task: isiZulu multiple-choice question answering
  • Training Data: AFRIMMLU isiZulu dev + validation splits (108 examples)
  • Method: LoRA fine-tuning with instruction tuning format

Training Configuration

  • LoRA Rank: 16
  • LoRA Alpha: 16
  • Learning Rate: 1e-05
  • Epochs: 2
  • Batch Size: 16 (effective)

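A minimal sketch of this configuration using peft and transformers is shown below; the target modules and the batch-size split are assumptions for illustration, not the released training script:

from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, TrainingArguments

base = AutoModelForCausalLM.from_pretrained("google/gemma-3-4b-it")

# LoRA configuration matching the values above; target_modules is an assumption
lora_config = LoraConfig(
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_config)

# Effective batch size 16 via gradient accumulation (the 4 x 4 split is an assumption)
training_args = TrainingArguments(
    output_dir="outputs",
    learning_rate=1e-5,
    num_train_epochs=2,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,
)
# A Trainer would then consume `model`, `training_args`, and the 108 formatted examples.
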
Prompt Format

Answer the following multiple-choice question about [subject].

Question: [isiZulu question]

Choices:
A) [choice 1]
B) [choice 2]
C) [choice 3]
D) [choice 4]

Select the correct answer (A, B, C, or D) first before you explain. The first character in your answer should be your choice (A,B,C, or D):
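
For programmatic use, the template can be filled from an AFRIMMLU-style record; format_prompt below is a hypothetical helper written to match the template above, not part of the released code:

def format_prompt(subject: str, question: str, choices: list[str]) -> str:
    """Fill the prompt template above from one AFRIMMLU-style record."""
    letters = ["A", "B", "C", "D"]
    choice_block = "\n".join(
        f"{letter}) {choice}" for letter, choice in zip(letters, choices)
    )
    return (
        f"Answer the following multiple-choice question about {subject}.\n\n"
        f"Question: {question}\n\n"
        f"Choices:\n{choice_block}\n\n"
        "Select the correct answer (A, B, C, or D) first before you explain. "
        "The first character in your answer should be your choice (A,B,C, or D):"
    )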

Usage

from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

# Load model
base_model = AutoModelForCausalLM.from_pretrained("google/gemma-3-4b-it")
tokenizer = AutoTokenizer.from_pretrained("google/gemma-3-4b-it")
model = PeftModel.from_pretrained(base_model, "Dineochiloane/gemma-3-4b-isizulu-afrimmlu")

# Format question
messages = [{
    "role": "user", 
    "content": "Answer the following multiple-choice question about elementary mathematics.\n\nQuestion: Lithini inani lika p ku 24 = 2p?\n\nChoices:\nA) p = 4\nB) p = 8\nC) p = 12\nD) p = 24\n\nSelect the correct answer (A, B, C, or D) first before you explain. The first character in your answer should be your choice (A,B,C, or D):"
}]

# Build the chat-formatted input and generate greedily
# (transformers rejects temperature=0.0; use do_sample=False for deterministic output)
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
outputs = model.generate(input_ids, max_new_tokens=100, do_sample=False)
# Decode only the newly generated tokens, not the prompt
response = tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True)
print(response)
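
Because the model is trained to lead with the choice letter, the prediction can be read off the first character of the generated text; a short sketch, assuming the response slicing above:

# Extract the predicted letter from the model's reply
prediction = response.strip()[:1].upper()
if prediction not in {"A", "B", "C", "D"}:
    prediction = None  # model failed to follow the format
print(prediction)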

Research Context

This model was fine-tuned as part of research on cross-lingual transfer learning for African languages, specifically comparing zero-shot and fine-tuned performance on the isiZulu split of AFRIMMLU.
