olmo-gsm8k-finetuned
This is a fine-tuned version of allenai/OLMoE-1B-7B-0125-Instruct using LoRA (Low-Rank Adaptation) for mathematical reasoning on the GSM8K dataset.
Model Details
- Base Model: allenai/OLMoE-1B-7B-0125-Instruct
- Fine-tuning Method: LoRA (Low-Rank Adaptation); see the adapter-config check below the list
- Dataset: GSM8K (Grade School Math 8K)
- Task: Mathematical reasoning and problem-solving
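To confirm the adapter type and base model programmatically, the PEFT configuration can be inspected without downloading the base weights. This is a minimal sketch; the values noted in the comments are what the model card states, not guaranteed output.

from peft import PeftConfig

# Read only the adapter metadata from the Hub
config = PeftConfig.from_pretrained("yassine-boua/olmo-gsm8k-finetuned")
print(config.peft_type)                # should report LoRA
print(config.base_model_name_or_path)  # should be allenai/OLMoE-1B-7B-0125-Instruct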
Usage
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch

# Load base model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("allenai/OLMoE-1B-7B-0125-Instruct")
base_model = AutoModelForCausalLM.from_pretrained(
    "allenai/OLMoE-1B-7B-0125-Instruct",
    device_map="auto",
    torch_dtype=torch.float16,
    trust_remote_code=True,
)

# Load LoRA adapter
model = PeftModel.from_pretrained(base_model, "yassine-boua/olmo-gsm8k-finetuned")
model.eval()

# Example usage
def generate_answer(question: str) -> str:
    messages = [
        {"role": "system", "content": "You are a helpful math assistant. Think step by step and provide your final answer in <answer></answer> tags."},
        {"role": "user", "content": question},
    ]
    prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
    inputs = tokenizer(prompt, return_tensors="pt", truncation=True, max_length=512)
    inputs = {k: v.to(model.device) for k, v in inputs.items()}

    with torch.no_grad():
        outputs = model.generate(
            **inputs,
            max_new_tokens=200,
            temperature=0.7,
            top_p=0.9,
            do_sample=True,
            # Fall back to the EOS token if the tokenizer defines no pad token
            pad_token_id=tokenizer.pad_token_id if tokenizer.pad_token_id is not None else tokenizer.eos_token_id,
        )

    # Decode only the newly generated tokens, not the prompt
    response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
    return response.strip()

# Test the model
question = "What is 15% of 240?"
answer = generate_answer(question)
print(answer)
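The system prompt asks the model to wrap its final result in <answer></answer> tags, so a small helper can pull that value out of the generated text. This is an illustrative sketch (the helper name and regex are not part of the released model); it falls back to the raw response if no tags are found.

import re

def extract_final_answer(response: str) -> str:
    # Grab the content of the <answer>...</answer> tags requested in the system prompt
    match = re.search(r"<answer>(.*?)</answer>", response, flags=re.DOTALL)
    return match.group(1).strip() if match else response.strip()

print(extract_final_answer(answer))  # 15% of 240 is 36; sampled output may vary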
Training Details
- Training Method: LoRA fine-tuning
- Target Dataset: GSM8K
- Optimization: GRPO (Group Relative Policy Optimization); a sketch of a comparable training setup follows this list
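Neither the exact hyperparameters nor the reward function for this checkpoint are published, so the following is only a minimal sketch of what a LoRA + GRPO setup on GSM8K could look like, assuming TRL's GRPOTrainer and GRPOConfig, PEFT's LoraConfig, and the openai/gsm8k dataset; every concrete value (rank, alpha, target modules, group size, reward rule) is illustrative.

import re
from datasets import load_dataset
from peft import LoraConfig
from trl import GRPOConfig, GRPOTrainer

# GSM8K stores the gold result after "####"; keep it as a reference column
def to_prompt(example):
    return {
        "prompt": example["question"],
        "reference": example["answer"].split("####")[-1].strip(),
    }

train_dataset = load_dataset("openai/gsm8k", "main", split="train").map(to_prompt)

# Reward 1.0 when the <answer> tag in a completion matches the gold result, else 0.0
def correctness_reward(completions, reference, **kwargs):
    rewards = []
    for completion, ref in zip(completions, reference):
        match = re.search(r"<answer>(.*?)</answer>", completion, flags=re.DOTALL)
        rewards.append(1.0 if match and match.group(1).strip() == ref else 0.0)
    return rewards

# Illustrative LoRA settings, not necessarily those used for this adapter
peft_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

trainer = GRPOTrainer(
    model="allenai/OLMoE-1B-7B-0125-Instruct",
    reward_funcs=correctness_reward,
    args=GRPOConfig(output_dir="olmo-gsm8k-finetuned", num_generations=8),
    train_dataset=train_dataset,
    peft_config=peft_config,
)
trainer.train()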
Performance
This model was fine-tuned specifically for mathematical reasoning and is intended for grade-school-level word problems such as those in GSM8K. No benchmark scores are reported for this checkpoint, so it is worth measuring accuracy on a held-out slice before relying on it; a minimal evaluation sketch follows.
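The snippet below reuses generate_answer and extract_final_answer from the Usage section to score a small slice of the GSM8K test split. The dataset id (openai/gsm8k), the slice size, and the plain string comparison are assumptions for illustration, not a reported evaluation protocol.

from datasets import load_dataset

# Rough accuracy check on a small GSM8K test slice using the helpers defined above
test_set = load_dataset("openai/gsm8k", "main", split="test").select(range(50))

correct = 0
for example in test_set:
    gold = example["answer"].split("####")[-1].strip()
    predicted = extract_final_answer(generate_answer(example["question"]))
    # Naive string match; a real evaluation would normalize numbers more carefully
    correct += int(predicted.replace(",", "").strip() == gold.replace(",", ""))

print(f"Accuracy on {len(test_set)} problems: {correct / len(test_set):.2%}")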
Limitations
- The model's performance is limited to the scope of the training data
- May not generalize well to advanced mathematical concepts beyond grade school level
- Inherits limitations from the base model
Citation
If you use this model, please cite the original OLMoE paper and the GSM8K dataset:
@article{muennighoff2024olmoe,
  title={OLMoE: Open Mixture-of-Experts Language Models},
  author={Muennighoff, Niklas and Soldaini, Luca and Groeneveld, Dirk and others},
  journal={arXiv preprint arXiv:2409.02060},
  year={2024}
}

@article{cobbe2021training,
  title={Training Verifiers to Solve Math Word Problems},
  author={Cobbe, Karl and Kosaraju, Vineet and Bavarian, Mohammad and others},
  journal={arXiv preprint arXiv:2110.14168},
  year={2021}
}