Curation of resources used in the paper "Demystifying Long Chain-of-Thought Reasoning in LLMs"
demystify-long-cot
models 29
demystify-long-cot/llama-3.1-8b-webit231k-qwq-n2-raw-sft-ppo • 8B
demystify-long-cot/llama-3.1-8b-webit231k-qwq-n1-raw-sft-ppo • 8B
demystify-long-cot/llama-3.1-8b-webit462k-qwq-n8-rft
demystify-long-cot/llama-3.1-8b-webit462k-qwq-n4-rft
demystify-long-cot/llama-3.1-8b-webit462k-qwq-n2-rft • 8B
demystify-long-cot/llama-3.1-8b-webit231k-qwq-n8-rft • 8B
demystify-long-cot/llama-3.1-8b-webit231k-qwq-n4-rft • 8B
demystify-long-cot/llama-3.1-8b-webit462k-qwq-n1-raw-sft • 8B
demystify-long-cot/llama-3.1-8b-webit231k-qwq-n4-raw-sft • 8B
demystify-long-cot/llama-3.1-8b-webit231k-qwq-n2-raw-sft • 8B
datasets 11
demystify-long-cot/math-train-action-n40 • 217k rows
demystify-long-cot/math-train-qwen-rs-n256 • 1.53M rows
demystify-long-cot/math-train-qwen-rs-n128 • 766k rows
demystify-long-cot/math-train-qwen-rs-n64 • 383k rows
demystify-long-cot/math-train-qwen-rs-n32 • 192k rows
demystify-long-cot/math-train-qwq-rs-n256 • 1.14M rows
demystify-long-cot/math-train-qwq-rs-n192 • 854k rows
demystify-long-cot/math-train-qwq-rs-n128 • 854k rows
demystify-long-cot/math-train-qwq-rs-n64 • 428k rows
demystify-long-cot/math-train-qwq-rs-n32
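The datasets listed above can be pulled with the Hugging Face `datasets` library. The following is a minimal sketch, not the authors' tooling: the helper names `sample_count` and `load_listed_dataset` are hypothetical, the `train` split name is an assumption that may not hold for every repository, and reading the trailing `-n<number>` token as a per-problem sample count is a guess from the naming pattern rather than something the page states.

```python
import re
from typing import Optional

# Repository IDs copied from the dataset list above.
DATASET_IDS = [
    "demystify-long-cot/math-train-action-n40",
    "demystify-long-cot/math-train-qwen-rs-n256",
    "demystify-long-cot/math-train-qwq-rs-n64",
]


def sample_count(repo_id: str) -> Optional[int]:
    """Return the integer after a trailing '-n' in a repo ID, or None.

    Assumption: the suffix encodes a sample count; the page does not say so.
    """
    match = re.search(r"-n(\d+)$", repo_id)
    return int(match.group(1)) if match else None


def load_listed_dataset(repo_id: str, split: str = "train"):
    """Stream a dataset from the Hub without downloading it in full.

    Assumption: the repository exposes a split named `split`.
    Requires `pip install datasets` and network access.
    """
    from datasets import load_dataset  # imported lazily so parsing works offline

    return load_dataset(repo_id, split=split, streaming=True)


if __name__ == "__main__":
    for repo_id in DATASET_IDS:
        print(repo_id, sample_count(repo_id))
```

Streaming is used here because several of the listed datasets exceed a million rows; calling `next(iter(load_listed_dataset(repo_id)))` inspects a single record before committing to a full download.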