Zooming without Zooming: Region-to-Image Distillation for Fine-Grained Multimodal Perception Paper • 2602.11858 • Published 6 days ago • 55
ScaleEnv: Scaling Environment Synthesis from Scratch for Generalist Interactive Tool-Use Agent Training Paper • 2602.06820 • Published 12 days ago • 13
CoBA-RL: Capability-Oriented Budget Allocation for Reinforcement Learning in LLMs Paper • 2602.03048 • Published 15 days ago • 33
V_0: A Generalist Value Model for Any Policy at State Zero Paper • 2602.03584 • Published 15 days ago • 21
Advancing Multimodal Reasoning via Reinforcement Learning with Cold Start Paper • 2505.22334 • Published May 28, 2025 • 36
Unsupervised Post-Training for Multi-Modal LLM Reasoning via GRPO Paper • 2505.22453 • Published May 28, 2025 • 46
Unified Multimodal Understanding and Generation Models: Advances, Challenges, and Opportunities Paper • 2505.02567 • Published May 5, 2025 • 80