Stabilizing Reinforcement Learning with LLMs: Formulation and Practices Paper • 2512.01374 • Published 8 days ago • 82
Stabilizing Reinforcement Learning with LLMs: Formulation and Practices Paper • 2512.01374 • Published 8 days ago • 82
RMTBench: Benchmarking LLMs Through Multi-Turn User-Centric Role-Playing Paper • 2507.20352 • Published Jul 27