olmo-gsm8k-finetuned
This is a fine-tuned version of allenai/OLMoE-1B-7B-0125-Instruct using LoRA (Low-Rank Adaptation) for mathematical reasoning on the GSM8K dataset.
Model Details
- Base Model: allenai/OLMoE-1B-7B-0125-Instruct
- Fine-tuning Method: LoRA (Low-Rank Adaptation); see the adapter-config check below the list
- Dataset: GSM8K (Grade School Math 8K)
- Task: Mathematical reasoning and problem-solving
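To confirm the adapter type and base model programmatically, the PEFT configuration can be inspected without downloading the base weights. This is a minimal sketch; the values noted in the comments are what the model card states, not guaranteed output.

from peft import PeftConfig

# Read only the adapter metadata from the Hub
config = PeftConfig.from_pretrained("yassine-boua/olmo-gsm8k-finetuned")
print(config.peft_type)                # should report LoRA
print(config.base_model_name_or_path)  # should be allenai/OLMoE-1B-7B-0125-Instruct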
Usage
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch

# Load base model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("allenai/OLMoE-1B-7B-0125-Instruct")
base_model = AutoModelForCausalLM.from_pretrained(
    "allenai/OLMoE-1B-7B-0125-Instruct",
    device_map="auto",
    torch_dtype=torch.float16,
    trust_remote_code=True,
)

# Load LoRA adapter
model = PeftModel.from_pretrained(base_model, "yassine-boua/olmo-gsm8k-finetuned")
model.eval()

# Example usage
def generate_answer(question: str) -> str:
    messages = [
        {"role": "system", "content": "You are a helpful math assistant. Think step by step and provide your final answer in <answer></answer> tags."},
        {"role": "user", "content": question},
    ]
    prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
    inputs = tokenizer(prompt, return_tensors="pt", truncation=True, max_length=512)
    inputs = {k: v.to(model.device) for k, v in inputs.items()}

    with torch.no_grad():
        outputs = model.generate(
            **inputs,
            max_new_tokens=200,
            temperature=0.7,
            top_p=0.9,
            do_sample=True,
            # Fall back to the EOS token if the tokenizer defines no pad token
            pad_token_id=tokenizer.pad_token_id if tokenizer.pad_token_id is not None else tokenizer.eos_token_id,
        )

    # Decode only the newly generated tokens, not the prompt
    response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
    return response.strip()

# Test the model
question = "What is 15% of 240?"
answer = generate_answer(question)
print(answer)
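The system prompt asks the model to wrap its final result in <answer></answer> tags, so a small helper can pull that value out of the generated text. This is an illustrative sketch (the helper name and regex are not part of the released model); it falls back to the raw response if no tags are found.

import re

def extract_final_answer(response: str) -> str:
    # Grab the content of the <answer>...</answer> tags requested in the system prompt
    match = re.search(r"<answer>(.*?)</answer>", response, flags=re.DOTALL)
    return match.group(1).strip() if match else response.strip()

print(extract_final_answer(answer))  # 15% of 240 is 36; sampled output may vary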
Training Details
- Training Method: LoRA fine-tuning
- Target Dataset: GSM8K
- Optimization: GRPO (Group Relative Policy Optimization); a sketch of a comparable training setup follows this list
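Neither the exact hyperparameters nor the reward function for this checkpoint are published, so the following is only a minimal sketch of what a LoRA + GRPO setup on GSM8K could look like, assuming TRL's GRPOTrainer and GRPOConfig, PEFT's LoraConfig, and the openai/gsm8k dataset; every concrete value (rank, alpha, target modules, group size, reward rule) is illustrative.

import re
from datasets import load_dataset
from peft import LoraConfig
from trl import GRPOConfig, GRPOTrainer

# GSM8K stores the gold result after "####"; keep it as a reference column
def to_prompt(example):
    return {
        "prompt": example["question"],
        "reference": example["answer"].split("####")[-1].strip(),
    }

train_dataset = load_dataset("openai/gsm8k", "main", split="train").map(to_prompt)

# Reward 1.0 when the <answer> tag in a completion matches the gold result, else 0.0
def correctness_reward(completions, reference, **kwargs):
    rewards = []
    for completion, ref in zip(completions, reference):
        match = re.search(r"<answer>(.*?)</answer>", completion, flags=re.DOTALL)
        rewards.append(1.0 if match and match.group(1).strip() == ref else 0.0)
    return rewards

# Illustrative LoRA settings, not necessarily those used for this adapter
peft_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

trainer = GRPOTrainer(
    model="allenai/OLMoE-1B-7B-0125-Instruct",
    reward_funcs=correctness_reward,
    args=GRPOConfig(output_dir="olmo-gsm8k-finetuned", num_generations=8),
    train_dataset=train_dataset,
    peft_config=peft_config,
)
trainer.train()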
Performance
This model was fine-tuned specifically for mathematical reasoning and is intended for grade-school-level word problems such as those in GSM8K. No benchmark scores are reported for this checkpoint, so it is worth measuring accuracy on a held-out slice before relying on it; a minimal evaluation sketch follows.
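The snippet below reuses generate_answer and extract_final_answer from the Usage section to score a small slice of the GSM8K test split. The dataset id (openai/gsm8k), the slice size, and the plain string comparison are assumptions for illustration, not a reported evaluation protocol.

from datasets import load_dataset

# Rough accuracy check on a small GSM8K test slice using the helpers defined above
test_set = load_dataset("openai/gsm8k", "main", split="test").select(range(50))

correct = 0
for example in test_set:
    gold = example["answer"].split("####")[-1].strip()
    predicted = extract_final_answer(generate_answer(example["question"]))
    # Naive string match; a real evaluation would normalize numbers more carefully
    correct += int(predicted.replace(",", "").strip() == gold.replace(",", ""))

print(f"Accuracy on {len(test_set)} problems: {correct / len(test_set):.2%}")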
Limitations
- The model's performance is limited to the scope of the training data
- May not generalize well to advanced mathematical concepts beyond grade school level
- Inherits limitations from the base model
Citation
If you use this model, please cite the original OLMoE paper and the GSM8K dataset:
@article{muennighoff2024olmoe,
  title={OLMoE: Open Mixture-of-Experts Language Models},
  author={Muennighoff, Niklas and Soldaini, Luca and Groeneveld, Dirk and others},
  journal={arXiv preprint arXiv:2409.02060},
  year={2024}
}

@article{cobbe2021training,
  title={Training Verifiers to Solve Math Word Problems},
  author={Cobbe, Karl and Kosaraju, Vineet and Bavarian, Mohammad and others},
  journal={arXiv preprint arXiv:2110.14168},
  year={2021}
}