InT: Self-Proposed Interventions Enable Credit Assignment in LLM Reasoning Paper • 2601.14209 • Published Jan 20 • 6
Failure-Prefix Conditioning Collection Collection for the paper: Training Reasoning Models on Saturated Problems via Failure-Prefix Conditioning • 5 items • Updated 16 days ago
guactastesgood/failure-prefix-conditioned-dataset-iteration-2 Viewer • Updated 16 days ago • 1.12k • 20
guactastesgood/DeepSeek-R1-Distill-Qwen-1.5B-failure-prefix-conditioning-iteration2 Text Generation • 2B • Updated 16 days ago • 22
guactastesgood/DeepSeek-R1-Distill-Qwen-1.5B-failure-prefix-conditioning-iteration1 2B • Updated 16 days ago • 28
On the Limits of Layer Pruning for Generative Reasoning in LLMs Paper • 2602.01997 • Published 18 days ago • 4
Training Reasoning Models on Saturated Problems via Failure-Prefix Conditioning Paper • 2601.20829 • Published 23 days ago • 6
Reinforcement Learning vs. Distillation: Understanding Accuracy and Capability in LLM Reasoning Paper • 2505.14216 • Published May 20, 2025 • 2