Multimodal LLMs as Customized Reward Models for Text-to-Image Generation Paper • 2507.21391 • Published Jul 28
Can LLMs Estimate Student Struggles? Human-AI Difficulty Alignment with Proficiency Simulation for Item Difficulty Prediction Paper • 2512.18880 • Published 6 days ago • 23
MusiXQA: Advancing Visual Music Understanding in Multimodal Large Language Models Paper • 2506.23009 • Published Jun 28 • 11
VisR-Bench: An Empirical Study on Visual Retrieval-Augmented Generation for Multilingual Long Document Understanding Paper • 2508.07493 • Published Aug 10 • 8
Towards Visual Text Grounding of Multimodal Large Language Model Paper • 2504.04974 • Published Apr 7 • 17
Towards Aligned Layout Generation via Diffusion Model with Aesthetic Constraints Paper • 2402.04754 • Published Feb 7, 2024 • 1