Anti-Self-Distillation for Reasoning RL via Pointwise Mutual Information Paper • 2605.11609 • Published 16 days ago • 191
Do Enterprise Systems Need Learned World Models? The Importance of Context to Infer Dynamics Paper • 2605.12178 • Published 16 days ago • 60
Mean Mode Screaming: Mean--Variance Split Residuals for 1000-Layer Diffusion Transformers Paper • 2605.06169 • Published 21 days ago • 228
Skill1: Unified Evolution of Skill-Augmented Agents via Reinforcement Learning Paper • 2605.06130 • Published 21 days ago • 110
HERMES++: Toward a Unified Driving World Model for 3D Scene Understanding and Generation Paper • 2604.28196 • Published 28 days ago • 71
sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2 Sentence Similarity • 0.1B • Updated Jan 28 • 49.6M • • 1.24k
Adam's Law: Textual Frequency Law on Large Language Models Paper • 2604.02176 • Published Apr 2 • 504
sentence-transformers/all-MiniLM-L6-v2 Sentence Similarity • 22.7M • Updated Mar 6, 2025 • 263M • • 4.84k