Wan-Move: Motion-controllable Video Generation via Latent Trajectory Guidance Paper • 2512.08765 • Published 27 days ago • 128
LongVT: Incentivizing "Thinking with Long Videos" via Native Tool Calling Paper • 2511.20785 • Published Nov 25, 2025 • 182
WorldPlay: Towards Long-Term Geometric Consistency for Real-Time Interactive World Modeling Paper • 2512.14614 • Published 20 days ago • 67
WorldWarp: Propagating 3D Geometry with Asynchronous Video Diffusion Paper • 2512.19678 • Published 14 days ago • 29
TurboDiffusion: Accelerating Video Diffusion Models by 100-200 Times Paper • 2512.16093 • Published 19 days ago • 91
Learning to Reason in 4D: Dynamic Spatial Understanding for Vision Language Models Paper • 2512.20557 • Published 13 days ago • 49
HiStream: Efficient High-Resolution Video Generation via Redundancy-Eliminated Streaming Paper • 2512.21338 • Published 12 days ago • 21
InsertAnywhere: Bridging 4D Scene Geometry and Diffusion Models for Realistic Video Object Insertion Paper • 2512.17504 • Published 17 days ago • 95
Mindscape-Aware Retrieval Augmented Generation for Improved Long Context Understanding Paper • 2512.17220 • Published 18 days ago • 107
Infinite-Homography as Robust Conditioning for Camera-Controlled Video Generation Paper • 2512.17040 • Published 18 days ago • 27
Vector Prism: Animating Vector Graphics by Stratifying Semantic Structure Paper • 2512.14336 • Published 20 days ago • 28
PHUMA: Physically-Grounded Humanoid Locomotion Dataset Paper • 2510.26236 • Published Oct 30, 2025 • 29
EgoX: Egocentric Video Generation from a Single Exocentric Video Paper • 2512.08269 • Published 28 days ago • 116
DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models Paper • 2512.02556 • Published Dec 2, 2025 • 244
Z-Image: An Efficient Image Generation Foundation Model with Single-Stream Diffusion Transformer Paper • 2511.22699 • Published Nov 27, 2025 • 224
ACG: Action Coherence Guidance for Flow-based VLA models Paper • 2510.22201 • Published Oct 25, 2025 • 36
VLA-Adapter: An Effective Paradigm for Tiny-Scale Vision-Language-Action Model Paper • 2509.09372 • Published Sep 11, 2025 • 243