osieosie/mixed_olympiads_paraphrased_32_s1_tulu3_sft_s1_10.0pct Viewer • Updated 11 days ago • 320 • 26
DR Tulu: Reinforcement Learning with Evolving Rubrics for Deep Research Paper • 2511.19399 • Published 12 days ago • 54
osieosie/tulu-2-7b_mixed_tulu3-sft_olympiads_32_seed1_original_320ex_10pct_e1_lr5e-06_bs128_constant Text Generation • 7B • Updated 12 days ago • 8