VERL Math Datasets
High-quality math reasoning datasets in VERL format
Viewer • Updated • 2.27M • 62 • 1
Note: Unified collection of 2.1M deduplicated math problems (from 27.9M originals). Combines all 8 datasets below with 92.36% deduplication.
sungyub/mathx-5m-verl
Viewer • Updated • 1.45M • 11
Note: Largest contributor (67.6%). From XenArcAI/MathX-5M with 94.6% deduplication.
sungyub/eurus-2-math-verl
Viewer • Updated • 412k • 21
Note: Second largest (14.9%). Diverse math reasoning from PRIME-RL/Eurus-2-RL-Data.
sungyub/big-math-rl-verl
Viewer • Updated • 242k • 23
Note: Third largest (10.5%). Includes solve_rate and domain metadata for curriculum learning.
sungyub/deepmath-103k-verl
Viewer • Updated • 102k • 7
sungyub/skywork-or1-math-verl
Viewer • Updated • 103k • 36
Note: Includes model difficulty ratings. From Skywork/Skywork-OR1-RL-Data.
sungyub/openr1-math-verl
Viewer • Updated • 184k • 135
Note: Highest quality (0.44% duplicates). Includes problem_type metadata.
sungyub/orz-math-72k-verl
Viewer • Updated • 46.4k • 32
Note: Medium-scale dataset. Originally 72K with 33.5% duplicates removed.
sungyub/deepscaler-preview-verl
Viewer • Updated • 38k • 67
Note: Competition math (AIME, AMC, Omni-MATH). High preservation rate (94%).
sungyub/dapo-math-17k-verl
Viewer • Updated • 17.2k • 24
Note: Smallest but fully preserved. From BytedTsinghua-SIA/DAPO-Math-17k.
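
The deduplication figures quoted in the notes can be sanity-checked with simple arithmetic. The sketch below is only an illustration: the helper `retained_after_dedup` is a hypothetical name, not part of any dataset tooling, and it assumes each quoted percentage is the share of original rows removed as duplicates. The inputs are the rounded values from the notes (27.9M originals at 92.36%, 72K originals at 33.5%), so the results are approximate.

```python
def retained_after_dedup(original_count: float, dedup_rate_percent: float) -> int:
    """Approximate rows left after removing the given percentage as duplicates.

    Hypothetical helper for illustration; assumes the percentage is the
    fraction of the original rows that was dropped.
    """
    return round(original_count * (1 - dedup_rate_percent / 100))


# Unified collection: 27.9M originals with 92.36% deduplication (from the note).
unified = retained_after_dedup(27.9e6, 92.36)

# orz-math-72k: roughly 72K originals with 33.5% duplicates removed (from the note).
orz = retained_after_dedup(72_000, 33.5)

print(f"unified collection: ~{unified:,} problems")
print(f"orz-math-72k:       ~{orz:,} problems")
```

The unified estimate comes out near 2.13M, in line with the "2.1M deduplicated math problems" in the note. The orz estimate is somewhat above the 46.4k shown in the listing, which is plausible if "72K" is itself a rounded original count.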