Less is More: Recursive Reasoning with Tiny Networks Paper • 2510.04871 • Published Oct 6 • 494
The Ultra-Scale Playbook 🌌: The ultimate guide to training LLMs on large GPU clusters • Running • 3.55k
Atla Selene Mini: A General Purpose Evaluation Model Paper • 2501.17195 • Published Jan 27 • 35