In-Video Instructions: Visual Signals as Generative Control • Paper • 2511.19401 • Published 12 days ago • 29
HoliTom: Holistic Token Merging for Fast Video Large Language Models • Paper • 2505.21334 • Published May 27 • 21
Memory-Efficient Visual Autoregressive Modeling with Scale-Aware KV Cache Compression • Paper • 2505.19602 • Published May 26 • 13
Can MLLMs Guide Me Home? A Benchmark Study on Fine-Grained Visual Reasoning from Transit Maps • Paper • 2505.18675 • Published May 24 • 25
VeriThinker: Learning to Verify Makes Reasoning Model Efficient • Paper • 2505.17941 • Published May 23 • 25
AceReason-Nemotron: Advancing Math and Code Reasoning through Reinforcement Learning • Paper • 2505.16400 • Published May 22 • 35