HiFi-Inpaint: Towards High-Fidelity Reference-Based Inpainting for Generating Detail-Preserving Human-Product Images Paper • 2603.02210 • Published 4 days ago • 24
From Scale to Speed: Adaptive Test-Time Scaling for Image Editing Paper • 2603.00141 • Published 10 days ago • 130
CiteAudit: You Cited It, But Did You Read It? A Benchmark for Verifying Scientific References in the LLM Era Paper • 2602.23452 • Published 8 days ago • 16
SkyReels-V4: Multi-modal Video-Audio Generation, Inpainting and Editing model Paper • 2602.21818 • Published 9 days ago • 52
OneVision-Encoder: Codec-Aligned Sparsity as a Foundational Principle for Multimodal Intelligence Paper • 2602.08683 • Published 25 days ago • 50
TimeChat-Captioner: Scripting Multi-Scene Videos with Time-Aware and Structural Audio-Visual Captions Paper • 2602.08711 • Published 25 days ago • 28
SkillRL: Evolving Agents via Recursive Skill-Augmented Reinforcement Learning Paper • 2602.08234 • Published 26 days ago • 69
Context Forcing: Consistent Autoregressive Video Generation with Long Context Paper • 2602.06028 • Published 29 days ago • 36
3D-Aware Implicit Motion Control for View-Adaptive Human Video Generation Paper • 2602.03796 • Published Feb 3 • 62
NativeTok: Native Visual Tokenization for Improved Image Generation Paper • 2601.22837 • Published Jan 30 • 9
DreamActor-M2: Universal Character Image Animation via Spatiotemporal In-Context Learning Paper • 2601.21716 • Published Jan 29 • 13
OmniTransfer: All-in-one Framework for Spatio-temporal Video Transfer Paper • 2601.14250 • Published Jan 20 • 47
Medical SAM3: A Foundation Model for Universal Prompt-Driven Medical Image Segmentation Paper • 2601.10880 • Published Jan 15 • 15
Transition Matching Distillation for Fast Video Generation Paper • 2601.09881 • Published Jan 14 • 33
3AM: Segment Anything with Geometric Consistency in Videos Paper • 2601.08831 • Published Jan 13 • 34