Xinyu Zhang

gaohaoyu

AI & ML interests

None yet

Organizations

None yet

Recent Activity

upvoted a paper about 5 hours ago

VLMs are Good Teachers for Video Reasoning via Adaptive Test-Time Optimization

Paper • 2606.02564 • Published 1 day ago • 20

upvoted a paper 2 days ago

How and What to Imagine? Visual Thinking in Unified Multimodal Models for Cross-View Spatial Reasoning

Paper • 2605.27310 • Published 8 days ago • 20

liked a Space 4 days ago

AimeeBingmouQu/ProtectBirds

🏃 Protect Birds • 335

liked a model 6 days ago

Qwen/Qwen3-VL-8B-Instruct

Image-Text-to-Text • 9B • Updated Oct 15, 2025 • 8.29M • 929

liked a dataset 10 days ago

nick007x/arxiv-papers

Viewer • Updated Apr 1 • 2.55M • 891k • 192

upvoted a paper 10 days ago

IndusAgent: Reinforcing Open-Vocabulary Industrial Anomaly Detection with Agentic Tools

Paper • 2605.20682 • Published 14 days ago • 83

liked 3 models 11 days ago

liked a dataset 12 days ago

m-a-p/FineFineWeb

Viewer • Updated Dec 19, 2024 • 4.89B • 697k • 144

upvoted a paper 15 days ago

WildTableBench: Benchmarking Multimodal Foundation Models on Table Understanding In the Wild

Paper • 2605.01018 • Published May 1 • 9

upvoted a paper 19 days ago

Visual Aesthetic Benchmark: Can Frontier Models Judge Beauty?

Paper • 2605.12684 • Published 22 days ago • 11

liked a model 22 days ago

bihungba1101/vocab-coedit-qwen3.5-0.8b-sft

Updated 22 days ago • 1

liked a model 26 days ago

meta-llama/Llama-3.1-8B-Instruct

Text Generation • 8B • Updated Sep 25, 2024 • 10.7M • 5.97k

upvoted a paper about 1 month ago

A Survey on LLM-based Conversational User Simulation

Paper • 2604.24977 • Published Apr 27 • 8

liked a model about 1 month ago

lllyasviel/ControlNet

Updated Feb 25, 2023 • 2 • 3.82k

upvoted 2 papers about 1 month ago

Elucidating the SNR-t Bias of Diffusion Probabilistic Models

Paper • 2604.16044 • Published Apr 17 • 73

LLaDA2.0-Uni: Unifying Multimodal Understanding and Generation with Diffusion Large Language Model

Paper • 2604.20796 • Published Apr 22 • 242

liked a model about 1 month ago

openbmb/VoxCPM2

Text-to-Speech • 2B • Updated Apr 16 • 238k • 1.36k

liked a model about 2 months ago

tencent/HY-Embodied-0.5

Image-Text-to-Text • 4B • Updated Apr 14 • 854 • 908
