IntentVLA: Short-Horizon Intent Modeling for Aliased Robot Manipulation Paper • 2605.14712 • Published 3 days ago • 14
FrameSkip: Learning from Fewer but More Informative Frames in VLA Training Paper • 2605.13757 • Published 4 days ago • 19
TrajSelector: Harnessing Latent Representations for Efficient and Effective Best-of-N in Large Reasoning Model Paper • 2510.16449 • Published Oct 18, 2025 • 35
PhysBrain: Human Egocentric Data as a Bridge from Vision Language Models to Physical Intelligence Paper • 2512.16793 • Published Dec 18, 2025 • 76
TwinBrainVLA: Unleashing the Potential of Generalist VLMs for Embodied Tasks via Asymmetric Mixture-of-Transformers Paper • 2601.14133 • Published Jan 20 • 61
BayesianVLA: Bayesian Decomposition of Vision Language Action Models via Latent Action Queries Paper • 2601.15197 • Published Jan 21 • 54
ScalSelect: Scalable Training-Free Multimodal Data Selection for Efficient Visual Instruction Tuning Paper • 2602.11636 • Published Feb 12 • 2
DynaSolidGeo: A Dynamic Benchmark for Genuine Spatial Mathematical Reasoning of VLMs in Solid Geometry Paper • 2510.22340 • Published Oct 25, 2025 • 1
liuhaotian/llava-v1.5-mlp2x-336px-pretrain-vicuna-7b-v1.5 Text Generation • Updated Oct 5, 2023 • 740 • 23