- MANZANO: A Simple and Scalable Unified Multimodal Model with a Hybrid Vision Tokenizer
  Paper • 2509.16197 • Published • 57
- InternRobotics/VLAC
  Robotics • 2B • Updated • 47 • 40
- LazyDrag: Enabling Stable Drag-Based Editing on Multi-Modal Diffusion Transformers via Explicit Correspondence
  Paper • 2509.12203 • Published • 20
- A Vision-Language-Action-Critic Model for Robotic Real-World Reinforcement Learning
  Paper • 2509.15937 • Published • 20
fysp
AI & ML interests
tech, ai, climate, social, disrupt
Recent Activity
- Liked a model 2 days ago: moonshotai/Kimi-K2.5
- Liked a model 11 days ago: zai-org/GLM-4.7-Flash
- Liked a model 14 days ago: zai-org/GLM-Image