koutch/short_paper_llama_2.json_train_dpo_v1_train_no_think • Text Generation • 8B • Updated about 18 hours ago • 37
koutch/short_paper_llama_2.json_train_dpo_v2_train_no_think • Text Generation • 8B • Updated about 18 hours ago • 29
koutch/short_paper_qwen_2.json_train_dpo_v2_train_no_think • Text Generation • 4B • Updated about 20 hours ago • 28
koutch/short_paper_qwen_2.json_train_dpo_v1_train_no_think • Text Generation • 4B • Updated about 20 hours ago • 22
koutch/short_paper_llama_llama3.1-8b_train_sft_all_train_no_think • Text Generation • 8B • Updated about 21 hours ago • 125
koutch/short_paper_llama_llama3.1-8b_train_sft_train_no_think • Text Generation • 8B • Updated about 21 hours ago • 271
koutch/short_paper_qwen_qwen3-instruct-4b_train_sft_all_train_no_think • Text Generation • 4B • Updated about 21 hours ago • 100
koutch/short_paper_llama_llama3.1-8b_train_sft_train_para • Text Generation • 8B • Updated about 21 hours ago • 146
koutch/short_paper_smol_2.json_train_dpo_v2_train_no_think • Text Generation • 3B • Updated about 21 hours ago • 27
koutch/short_paper_smol_2.json_train_dpo_v1_train_no_think • Text Generation • 3B • Updated about 21 hours ago • 28