15 138

Zhao

Hanyu66 · ZZHanyu

AI & ML interests

CV, NLP

Recent Activity

liked a model 13 days ago

InstantX/Qwen-Image-ControlNet-Union

updated a collection 26 days ago

liked a model 29 days ago

jay-jnp/F-ViTA_KAIST

Organizations

None yet

upvoted 5 papers about 1 month ago

Too Good to be Bad: On the Failure of LLMs to Role-Play Villains

Paper • 2511.04962 • Published Nov 7 • 52

DeepEyesV2: Toward Agentic Multimodal Model

Paper • 2511.05271 • Published Nov 7 • 42

Towards Mitigating Hallucinations in Large Vision-Language Models by Refining Textual Embeddings

Paper • 2511.05017 • Published Nov 7 • 7

Visual Spatial Tuning

Paper • 2511.05491 • Published Nov 7 • 50

MIRO: MultI-Reward cOnditioned pretraining improves T2I quality and efficiency

Paper • 2510.25897 • Published Oct 29 • 16

upvoted a paper about 2 months ago

Concerto: Joint 2D-3D Self-Supervised Learning Emerges Spatial Representations

Paper • 2510.23607 • Published Oct 27 • 175

upvoted a collection about 1 year ago

Papers I've read

16 items • Updated Jan 12 • 6

upvoted a paper about 1 year ago

Large Language Models Cannot Self-Correct Reasoning Yet

Paper • 2310.01798 • Published Oct 3, 2023 • 36

upvoted a collection about 1 year ago

MoEs papers reading list

60 items • Updated Nov 4, 2024 • 145

upvoted 6 papers over 1 year ago

Position-Enhanced Visual Instruction Tuning for Multimodal Large Language Models

Paper • 2308.13437 • Published Aug 25, 2023 • 4

Qwen-VL: A Frontier Large Vision-Language Model with Versatile Abilities

Paper • 2308.12966 • Published Aug 24, 2023 • 11

Zero-Shot and Few-Shot Video Question Answering with Multi-Modal Prompts

Paper • 2309.15915 • Published Sep 27, 2023 • 2

Reformulating Vision-Language Foundation Models and Datasets Towards Universal Multimodal Assistants

Paper • 2310.00653 • Published Oct 1, 2023 • 3

An Empirical Study of Scaling Instruct-Tuned Large Multimodal Models

Paper • 2309.09958 • Published Sep 18, 2023 • 19

Improved Baselines with Visual Instruction Tuning

Paper • 2310.03744 • Published Oct 5, 2023 • 39