Ye Liu

yeliudev

https://yeliu.dev/

AI & ML interests

Vision & Language

Recent Activity

upvoted a paper 6 days ago

GameWorld: Towards Standardized and Verifiable Evaluation of Multimodal Game Agents

updated a Space 13 days ago

PolyU-ChenLab/Video-Highlights

upvoted a paper about 1 month ago

Mixture-of-Depths Attention

Organizations

yeliudev's collections 4

VideoMind

[ICLR 2026] VideoMind: A Chain-of-LoRA Agent for Temporal-Grounded Video Reasoning

Running on Zero

Agents

37

VideoMind 2B

💡

37

A Chain-of-LoRA Agent for Temporal-Grounded Video Reasoning
yeliudev/VideoMind-2B

Video-Text-to-Text • Updated Jan 27 • 25 • 2
yeliudev/VideoMind-7B

Video-Text-to-Text • Updated Jan 27 • 45 • 4
yeliudev/VideoMind-Dataset

Preview • Updated Jan 27 • 4.36k • 21

E.T. Bench

[NeurIPS 2024] E.T. Bench: Towards Open-Ended Event-Level Video-Language Understanding

PolyU-ChenLab/ETBench

Viewer • Updated Oct 29, 2024 • 5 • 113 • 4
PolyU-ChenLab/ET-Instruct-164K

Viewer • Updated Sep 27, 2024 • 115k • 355 • 5
PolyU-ChenLab/ETChat-Phi3-Mini-Stage-1

Video-Text-to-Text • 5B • Updated Oct 29, 2024 • 18 • 2
PolyU-ChenLab/ETChat-Phi3-Mini-Stage-2

5B • Updated Sep 27, 2024 • 8

UniPixel

[NeurIPS 2025] UniPixel: Unified Object Referring and Segmentation for Pixel-Level Visual Reasoning

Running on Zero

Agents

6

UniPixel

🔮

6

An MLLM for Unified Object Referring and Segmentation
PolyU-ChenLab/UniPixel-3B

Video-Text-to-Text • 4B • Updated Oct 4, 2025 • 118 • 3
PolyU-ChenLab/UniPixel-7B

Video-Text-to-Text • 8B • Updated Oct 22, 2025 • 51 • 1
PolyU-ChenLab/UniPixel-SFT-1M

Preview • Updated Oct 4, 2025 • 132 • 2

R2-Tuning

[ECCV 2024] R2-Tuning: Efficient Image-to-Video Transfer Learning for Video Temporal Grounding

Running

Agents

6

R2-Tuning

🌀

6

[ECCV 2024] Localizing moments in videos via text queries
yeliudev/R2-Tuning

Updated Apr 17, 2024 • 2
R^2-Tuning: Efficient Image-to-Video Transfer Learning for Video Temporal Grounding

Paper • 2404.00801 • Published Mar 31, 2024 • 1
