NitroGen: An Open Foundation Model for Generalist Gaming Agents Paper • 2601.02427 • Published 11 days ago • 39 • 3
Thinking with Map: Reinforced Parallel Map-Augmented Agent for Geolocalization Paper • 2601.05432 • Published 7 days ago • 151 • 5
Breaking the Sorting Barrier for Directed Single-Source Shortest Paths Paper • 2504.17033 • Published Apr 23, 2025 • 1
Transolver: A Fast Transformer Solver for PDEs on General Geometries Paper • 2402.02366 • Published Feb 4, 2024 • 1
SmolDocling: An ultra-compact vision-language model for end-to-end multi-modal document conversion Paper • 2503.11576 • Published Mar 14, 2025 • 132 • 19
ImLoc: Revisiting Visual Localization with Image-based Representation Paper • 2601.04185 • Published 8 days ago • 2 • 1
dots.ocr: Multilingual Document Layout Parsing in a Single Vision-Language Model Paper • 2512.02498 • Published Dec 2, 2025 • 1
MMFormalizer: Multimodal Autoformalization in the Wild Paper • 2601.03017 • Published 9 days ago • 101 • 6
RL-AWB: Deep Reinforcement Learning for Auto White Balance Correction in Low-Light Night-time Scenes Paper • 2601.05249 • Published 7 days ago • 43 • 3
Entropy-Adaptive Fine-Tuning: Resolving Confident Conflicts to Mitigate Forgetting Paper • 2601.02151 • Published 10 days ago • 93 • 8
NeoVerse: Enhancing 4D World Model with in-the-wild Monocular Videos Paper • 2601.00393 • Published 14 days ago • 116 • 4
Effort: Efficient Orthogonal Modeling for Generalizable AI-Generated Image Detection Paper • 2411.15633 • Published Nov 23, 2024 • 1
InfiniDepth: Arbitrary-Resolution and Fine-Grained Depth Estimation with Neural Implicit Fields Paper • 2601.03252 • Published 9 days ago • 95 • 9
Taming Hallucinations: Boosting MLLMs' Video Understanding via Counterfactual Video Generation Paper • 2512.24271 • Published 16 days ago • 59 • 6
Can We Trust AI Explanations? Evidence of Systematic Underreporting in Chain-of-Thought Reasoning Paper • 2601.00830 • Published 21 days ago • 2 • 3