olmo-gsm8k-finetuned

This is a fine-tuned version of allenai/OLMoE-1B-7B-0125-Instruct using LoRA (Low-Rank Adaptation) for mathematical reasoning on the GSM8K dataset.

Model Details

  • Base Model: allenai/OLMoE-1B-7B-0125-Instruct
  • Fine-tuning Method: LoRA (Low-Rank Adaptation); the adapter configuration can be inspected with the sketch below
  • Dataset: GSM8K (Grade School Math 8K)
  • Task: Mathematical reasoning and problem-solving
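
The LoRA hyperparameters (rank, alpha, target modules) are not documented above, but they can be read from the adapter's configuration file. A minimal sketch, assuming only the adapter repo id used in the Usage section below:

from peft import PeftConfig

# Load just the adapter configuration (no weights) to inspect the LoRA setup
config = PeftConfig.from_pretrained("yassine-boua/olmo-gsm8k-finetuned")
print(config.base_model_name_or_path)   # allenai/OLMoE-1B-7B-0125-Instruct
print(config.r, config.lora_alpha)      # LoRA rank and scaling factor
print(config.target_modules)            # modules the adapter was applied to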

Usage

from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch

# Load base model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("allenai/OLMoE-1B-7B-0125-Instruct")
base_model = AutoModelForCausalLM.from_pretrained(
    "allenai/OLMoE-1B-7B-0125-Instruct",
    device_map="auto",
    torch_dtype=torch.float16,
    trust_remote_code=True
)

# Load LoRA adapter
model = PeftModel.from_pretrained(base_model, "yassine-boua/olmo-gsm8k-finetuned")

# Example usage
def generate_answer(question: str):
    messages = [
        {"role": "system", "content": "You are a helpful math assistant. Think step by step and provide your final answer in <answer></answer> tags."},
        {"role": "user", "content": question}
    ]
    
    prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
    inputs = tokenizer(prompt, return_tensors="pt", truncation=True, max_length=512)
    inputs = {k: v.to(model.device) for k, v in inputs.items()}
    
    with torch.no_grad():
        outputs = model.generate(
            **inputs,
            max_new_tokens=200,
            temperature=0.7,
            top_p=0.9,
            do_sample=True,
            # Fall back to EOS if the tokenizer defines no dedicated pad token
            pad_token_id=tokenizer.pad_token_id if tokenizer.pad_token_id is not None else tokenizer.eos_token_id
        )
    
    # Decode only the newly generated tokens (everything after the prompt)
    response = tokenizer.decode(outputs[0][inputs['input_ids'].shape[1]:], skip_special_tokens=True)
    return response.strip()

# Test the model
question = "What is 15% of 240?"
answer = generate_answer(question)
print(answer)
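
Because the system prompt asks the model to wrap its final result in <answer></answer> tags, the numeric answer can be pulled out of the generated text. A minimal extraction sketch (the helper name is illustrative; only the tag format above is assumed):

import re

def extract_answer(response: str):
    # Grab the contents of the <answer></answer> tags, if present
    match = re.search(r"<answer>(.*?)</answer>", response, re.DOTALL)
    if match is None:
        return None
    # Keep the last number in the tag, e.g. "36" from "36 dollars"
    numbers = re.findall(r"-?\d+(?:\.\d+)?", match.group(1).replace(",", ""))
    return numbers[-1] if numbers else match.group(1).strip()

print(extract_answer(answer))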

Training Details

  • Training Method: LoRA fine-tuning
  • Target Dataset: GSM8K
  • Optimization: GRPO (Group Relative Policy Optimization); a reward-function sketch follows this list
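
The training script itself is not included in this repository. As a rough sketch of what GRPO-style training on GSM8K usually involves, a correctness reward can compare the model's <answer> tag against the gold answer that GSM8K stores after the "####" marker (function names below are illustrative, not the actual training code):

import re

def gold_answer(reference: str) -> str:
    # GSM8K reference solutions end with "#### <final number>"
    return reference.split("####")[-1].strip().replace(",", "")

def correctness_reward(completions, references):
    # 1.0 if the completion's <answer> tag matches the gold answer, else 0.0
    rewards = []
    for completion, reference in zip(completions, references):
        match = re.search(r"<answer>(.*?)</answer>", completion, re.DOTALL)
        predicted = match.group(1).strip().replace(",", "") if match else ""
        rewards.append(1.0 if predicted == gold_answer(reference) else 0.0)
    return rewards

In GRPO, several completions are sampled per question and each completion's reward is compared against its group's average, so the policy is nudged toward answers that beat their own group rather than toward an absolute score.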

Performance

This model was fine-tuned specifically for mathematical reasoning and targets grade-school math word problems of the kind found in GSM8K. No benchmark results are reported in this card.
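
A hedged sketch of how accuracy on the GSM8K test split could be measured with the generate_answer helper from the Usage section (the dataset id and field names follow the public GSM8K release on the Hugging Face Hub):

from datasets import load_dataset
import re

test_set = load_dataset("gsm8k", "main", split="test")
subset = test_set.select(range(100))  # small subset to keep runtime manageable

correct = 0
for example in subset:
    gold = example["answer"].split("####")[-1].strip().replace(",", "")
    response = generate_answer(example["question"])
    match = re.search(r"<answer>(.*?)</answer>", response, re.DOTALL)
    predicted = match.group(1).strip().replace(",", "") if match else ""
    correct += int(predicted == gold)

print(f"Accuracy on {len(subset)} questions: {correct / len(subset):.2%}")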

Limitations

  • The model's performance is limited to the scope of the training data
  • May not generalize well to advanced mathematical concepts beyond grade school level
  • Inherits limitations from the base model

Citation

If you use this model, please cite the original OLMoE paper and the GSM8K dataset:

@article{muennighoff2024olmoe,
  title={OLMoE: Open Mixture-of-Experts Language Models},
  author={Muennighoff, Niklas and Soldaini, Luca and Groeneveld, Dirk and others},
  journal={arXiv preprint arXiv:2409.02060},
  year={2024}
}

@article{cobbe2021gsm8k,
  title={Training Verifiers to Solve Math Word Problems},
  author={Cobbe, Karl and Kosaraju, Vineet and Bavarian, Mohammad and others},
  journal={arXiv preprint arXiv:2110.14168},
  year={2021}
}