Point Transformer V3 Extreme: 1st Place Solution for 2024 Waymo Open Dataset Challenge in Semantic Segmentation Paper • 2407.15282 • Published Jul 21, 2024
LiteReality: Graphics-Ready 3D Scene Reconstruction from RGB-D Scans Paper • 2507.02861 • Published Jul 3, 2025 • 1
UniPAD: A Universal Pre-training Paradigm for Autonomous Driving Paper • 2310.08370 • Published Oct 12, 2023
MiCo: Multi-image Contrast for Reinforcement Visual Reasoning Paper • 2506.22434 • Published Jun 27, 2025 • 10
PonderV2: Pave the Way for 3D Foundation Model with A Universal Pre-training Paradigm Paper • 2310.08586 • Published Oct 12, 2023
VIRT: Vision Instructed Transformer for Robotic Manipulation Paper • 2410.07169 • Published Oct 9, 2024
MarS3D: A Plug-and-Play Motion-Aware Model for Semantic Segmentation on Multi-Scan 3D Point Clouds Paper • 2307.09316 • Published Jul 18, 2023 • 1
DriveGPT4: Interpretable End-to-end Autonomous Driving via Large Language Model Paper • 2310.01412 • Published Oct 2, 2023 • 1
Concerto: Joint 2D-3D Self-Supervised Learning Emerges Spatial Representations Paper • 2510.23607 • Published Oct 27, 2025 • 179
Mitigating Object Hallucinations via Sentence-Level Early Intervention Paper • 2507.12455 • Published Jul 16, 2025 • 9
VisionThink: Smart and Efficient Vision Language Model via Reinforcement Learning Paper • 2507.13348 • Published Jul 17, 2025 • 79
Sonata: Self-Supervised Learning of Reliable Point Representations Paper • 2503.16429 • Published Mar 20, 2025 • 13
LISA++: An Improved Baseline for Reasoning Segmentation with Large Language Model Paper • 2312.17240 • Published Dec 28, 2023 • 1