AlignGuard: Scalable Safety Alignment for Text-to-Image Generation Paper • 2412.10493 • Published Dec 13, 2024
LongVideoAgent: Multi-Agent Reasoning with Long Videos Paper • 2512.20618 • Published 4 days ago • 48
Latent Guard: a Safety Framework for Text-to-image Generation Paper • 2404.08031 • Published Apr 11, 2024
Fake it till You Make it: Reward Modeling as Discriminative Prediction Paper • 2506.13846 • Published Jun 16
TurboDiffusion: Accelerating Video Diffusion Models by 100-200 Times Paper • 2512.16093 • Published 10 days ago • 73
Robust-R1: Degradation-Aware Reasoning for Robust Visual Understanding Paper • 2512.17532 • Published 8 days ago • 63
N3D-VLM: Native 3D Grounding Enables Accurate Spatial Reasoning in Vision-Language Models Paper • 2512.16561 • Published 9 days ago • 19
ToolOrchestra: Elevating Intelligence via Efficient Model and Tool Orchestration Paper • 2511.21689 • Published Nov 26 • 109
ScaleCUA: Scaling Open-Source Computer Use Agents with Cross-Platform Data Paper • 2509.15221 • Published Sep 18 • 111
ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models Paper • 2505.24864 • Published May 30 • 143