EmbodiedOneVision: Interleaved Vision-Text-Action Pretraining for General Robot Control
Paper: 2508.21112
EmbodiedOneVision is a unified framework for multimodal embodied reasoning and robot control, featuring interleaved vision-text-action pretraining.