- Modality Gap-Driven Subspace Alignment Training Paradigm for Multimodal Large Language Models • Paper 2602.07026 • Published 12 days ago • 133 upvotes
- Watching, Reasoning, and Searching: A Video Deep Research Benchmark on Open Web for Agentic Video Reasoning • Paper 2601.06943 • Published Jan 11 • 211 upvotes
- Thinking with Video: Video Generation as a Promising Multimodal Reasoning Paradigm • Paper 2511.04570 • Published Nov 6, 2025 • 216 upvotes
- RoboOmni: Proactive Robot Manipulation in Omni-modal Context • Paper 2510.23763 • Published Oct 27, 2025 • 56 upvotes
- EO-Robotics • Collection • EmbodiedOneVision is a unified framework for multimodal embodied reasoning and robot control, featuring interleaved vision-text-action pretraining. • 8 items • Updated Dec 7, 2025 • 8 upvotes
- Unicorn: Text-Only Data Synthesis for Vision Language Model Training • Paper 2503.22655 • Published Mar 28, 2025 • 39 upvotes
- DropletVideo: A Dataset and Approach to Explore Integral Spatio-Temporal Consistent Video Generation • Paper 2503.06053 • Published Mar 8, 2025 • 138 upvotes
- RULE: Reliable Multimodal RAG for Factuality in Medical Vision Language Models • Paper 2407.05131 • Published Jul 6, 2024 • 26 upvotes