Coder SFT Data
  ise-uiuc/Magicoder-Evol-Instruct-110K • Updated Dec 28, 2023 • 111k • 9.38k • 177
  theblackcat102/evol-codealpaca-v1 • Updated Mar 10, 2024 • 111k • 7.79k • 180
  Multilingual-Multimodal-NLP/McEval-Instruct • Updated Jun 12, 2024 • 35.9k • 96 • 37
  KodCode/KodCode-V1-SFT-4o • Updated Mar 16, 2025 • 410k • 1.17k • 10
Coder DPO
  argilla/ultrafeedback-binarized-preferences-cleaned • Updated Dec 11, 2023 • 60.9k • 9.82k • 162
  argilla/ultrafeedback-multi-binarized-quality-preferences-cleaned • Updated Dec 11, 2023 • 155k • 116 • 5
Funny Questions (Long-COT)
  JackGao/brain-teaser-chinese • Updated Mar 4, 2025 • 1.15k • 27 • 5
  Conard/fortune-telling • Updated Feb 17, 2025 • 207 • 482 • 171
Reasoning Model
  deepcogito/cogito-v1-preview-qwen-32B • Text Generation • Updated Apr 8, 2025 • 25.1k • 116
Pretrain Data Utils
  mlfoundations/fasttext-oh-eli5 • Updated Aug 1, 2024 • 30
  hkust-nlp/preselect-fasttext-classifier • Text Classification • Updated Mar 6, 2025 • 10 • 8
  HuggingFaceFW/fineweb-edu-classifier • Text Classification • 0.1B • Updated Nov 17, 2024 • 35.3k • 211
Coder SFT Data (Long-COT)
  nvidia/Llama-Nemotron-Post-Training-Dataset • Updated May 8, 2025 • 3.91M • 3.35k • 659
  open-r1/codeforces-cots • Updated Mar 28, 2025 • 254k • 6.37k • 216
  nvidia/OpenCodeReasoning • Updated May 4, 2025 • 753k • 8.17k • 536
  nvidia/OpenCodeReasoning-2 • Updated May 17, 2025 • 2.16M • 3.19k • 54
Math SFT Data
  BytedTsinghua-SIA/DAPO-Math-17k • Updated Apr 18, 2025 • 1.79M • 10.9k • 168
  nvidia/OpenMathInstruct-2 • Updated Nov 25, 2024 • 22M • 22.9k • 240
  nvidia/OpenMathReasoning • Updated May 27, 2025 • 5.68M • 12.2k • 455
  miromind-ai/MiroMind-M1-SFT-719K • Updated Jul 22, 2025 • 719k • 196 • 20
WebPage Related
  HuggingFaceM4/WebSight • Updated Mar 26, 2024 • 2.75M • 20.5k • 393
  bytedance-research/Web-Bench • Updated May 19, 2025 • 1k • 586 • 11
  luzimu/WebGen-Bench • Updated Sep 29, 2025 • 6.77k • 272 • 3
Coder Models
  agentica-org/DeepCoder-14B-Preview • Text Generation • Updated May 11, 2025 • 546 • 681
  Qwen/Qwen2.5-Coder-32B-Instruct • Text Generation • 33B • Updated Jan 12, 2025 • 1.28M • 2.02k