---
license: cc-by-nc-4.0
language:
- en
pipeline_tag: text-generation
tags:
- nvidia
- AceInstruct
- code
- math
- general_domain
- instruct_model
- pytorch
---

## Introduction
We introduce AceInstruct, a family of advanced SFT models for coding, mathematics, and general-purpose tasks. The AceInstruct family, which includes AceInstruct-1.5B, 7B, and 72B, is <b>Improved using Qwen</b>.
These models are fine-tuned on Qwen2.5-Base using [general SFT datasets](https://huggingface.co/datasets/nvidia/AceMath-Instruct-Training-Data), the same data used to train [AceMath-Instruct](https://huggingface.co/nvidia/AceMath-72B-Instruct). Unlike AceMath-Instruct, which is specialized for math questions, AceInstruct is versatile and can be applied to a wide range of domains. Benchmark evaluations across coding, mathematics, and general knowledge tasks show that AceInstruct delivers performance comparable to Qwen2.5-Instruct.
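
If you want to inspect the SFT data referenced above, it can be loaded with the `datasets` library. The snippet below is a minimal sketch; the config and split names are assumptions, so check the dataset card for the exact ones.

```python
from datasets import load_dataset

# Assumed config/split names -- verify against the dataset card before relying on them.
sft_data = load_dataset("nvidia/AceMath-Instruct-Training-Data", "general_sft_stage2", split="train")
print(sft_data[0])  # expected: chat-style instruction/response fields (assumption)
```
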

For more information about AceInstruct, check our [website](https://research.nvidia.com/labs/adlr/acemath/) and [paper](https://arxiv.org/abs/2412.15084).

## Benchmark Results

| Benchmark | Qwen2.5-1.5B-Instruct | AceInstruct-1.5B | Qwen2.5-7B-Instruct | AceInstruct-7B | Qwen2.5-72B-Instruct | AceInstruct-72B |
| --------- |:-----:|:-----:|:-----:|:-----:|:-----:|:-----:|
| HumanEval | 61.60 | 73.17 | 84.80 | 85.37 | 86.60 | 89.63 |
| MBPP | 63.20 | 65.76 | 79.20 | 74.32 | 88.20 | 83.66 |
| GSM8K | 73.20 | 80.44 | 91.60 | 93.10 | 95.80 | 96.36 |
| MATH | 55.20 | 60.34 | 75.50 | 76.40 | 83.10 | 84.50 |
| MMLU | 58.37 | 58.17 | 74.51 | 74.68 | 84.67 | 83.88 |
| MMLU Pro | 32.40 | 33.78 | 56.30 | 54.50 | 71.10 | 66.10 |
| Average | 57.33 | 61.94 | 76.99 | 76.40 | 84.91 | 84.02 |

We compare AceInstruct to Qwen2.5-Instruct across coding, mathematics, and general knowledge tasks. We find that AceInstruct-1.5B outperforms Qwen2.5-1.5B-Instruct (61.94 vs. 57.33 average), while AceInstruct-7B and AceInstruct-72B perform similarly to Qwen2.5-7B-Instruct and Qwen2.5-72B-Instruct.
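
For reference, the Average row is the unweighted mean of the six benchmark scores. A quick check in Python using the 1.5B columns copied from the table above:

```python
# Unweighted mean of the six benchmark scores (HumanEval, MBPP, GSM8K, MATH, MMLU, MMLU Pro).
scores = {
    "Qwen2.5-1.5B-Instruct": [61.60, 63.20, 73.20, 55.20, 58.37, 32.40],
    "AceInstruct-1.5B": [73.17, 65.76, 80.44, 60.34, 58.17, 33.78],
}
for name, vals in scores.items():
    print(f"{name}: {sum(vals) / len(vals):.2f}")
# Qwen2.5-1.5B-Instruct: 57.33
# AceInstruct-1.5B: 61.94
```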

## All Resources
### AceMath Instruction Models
- [AceMath-1.5B-Instruct](https://huggingface.co/nvidia/AceMath-1.5B-Instruct), [AceMath-7B-Instruct](https://huggingface.co/nvidia/AceMath-7B-Instruct), [AceMath-72B-Instruct](https://huggingface.co/nvidia/AceMath-72B-Instruct)

### AceMath Reward Models
- [AceMath-7B-RM](https://huggingface.co/nvidia/AceMath-7B-RM), [AceMath-72B-RM](https://huggingface.co/nvidia/AceMath-72B-RM)

### Evaluation & Training Data
- [AceMath-RewardBench](https://huggingface.co/datasets/nvidia/AceMath-RewardBench), [AceMath-Instruct Training Data](https://huggingface.co/datasets/nvidia/AceMath-Instruct-Training-Data), [AceMath-RM Training Data](https://huggingface.co/datasets/nvidia/AceMath-RM-Training-Data)

### General Instruction Models
- [AceInstruct-1.5B](https://huggingface.co/nvidia/AceInstruct-1.5B), [AceInstruct-7B](https://huggingface.co/nvidia/AceInstruct-7B), [AceInstruct-72B](https://huggingface.co/nvidia/AceInstruct-72B)

## How to use
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the tokenizer and model from the Hugging Face Hub
model_name = "nvidia/AceInstruct-7B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto", device_map="auto")

# Build a single-turn chat prompt with the model's chat template
prompt = "Tell me something about artificial intelligence."
messages = [{"role": "user", "content": prompt}]

text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to("cuda")

# Generate a response and strip the prompt tokens from the output
generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=1024
)
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
```
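
The decoded `response` string can be printed or reused for a follow-up turn. Below is a minimal sketch continuing the example above (the follow-up question is illustrative): it appends the assistant reply to `messages` and repeats the same template/generate/decode steps for multi-turn chat.

```python
print(response)

# Continue the conversation: append the assistant reply and a new user turn,
# then repeat the chat-template, generate, and decode steps from the example above.
messages.append({"role": "assistant", "content": response})
messages.append({"role": "user", "content": "Can you give a concrete everyday example?"})

text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
model_inputs = tokenizer([text], return_tensors="pt").to("cuda")
generated_ids = model.generate(**model_inputs, max_new_tokens=1024)
generated_ids = [out[len(inp):] for inp, out in zip(model_inputs.input_ids, generated_ids)]
follow_up = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(follow_up)
```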

## Correspondence to
Zihan Liu ([email protected]), Yang Chen ([email protected]), Wei Ping ([email protected])

## Citation
If you find our work helpful, we'd appreciate it if you could cite us.
<pre>
@article{acemath2024,
  title={AceMath: Advancing Frontier Math Reasoning with Post-Training and Reward Modeling},
  author={Liu, Zihan and Chen, Yang and Shoeybi, Mohammad and Catanzaro, Bryan and Ping, Wei},
  journal={arXiv preprint},
  year={2024}
}
</pre>

## License
All models in the AceInstruct family are for non-commercial use only, subject to the [Terms of Use](https://openai.com/policies/row-terms-of-use/) governing data generated by OpenAI. The AceInstruct models are released under the [Creative Commons Attribution-NonCommercial 4.0 International](https://spdx.org/licenses/CC-BY-NC-4.0) license.