Wan-Move: Motion-controllable Video Generation via Latent Trajectory Guidance Paper • 2512.08765 • Published 20 days ago • 127
MLLMGuard: A Multi-dimensional Safety Evaluation Suite for Multimodal Large Language Models Paper • 2406.07594 • Published Jun 11, 2024 • 1
A Rigorous Benchmark with Multidimensional Evaluation for Deep Research Agents: From Answers to Reports Paper • 2510.02190 • Published Oct 2 • 18
A.S.E: A Repository-Level Benchmark for Evaluating Security in AI-Generated Code Paper • 2508.18106 • Published Aug 25 • 346
InstructTTSEval: Benchmarking Complex Natural-Language Instruction Following in Text-to-Speech Systems Paper • 2506.16381 • Published Jun 19 • 2
UniF^2ace: Fine-grained Face Understanding and Generation with Unified Multimodal Models Paper • 2503.08120 • Published Mar 11 • 31
Word Form Matters: LLMs' Semantic Reconstruction under Typoglycemia Paper • 2503.01714 • Published Mar 3 • 5
URSA: Understanding and Verifying Chain-of-thought Reasoning in Multimodal Mathematics Paper • 2501.04686 • Published Jan 8 • 53
Flames: Benchmarking Value Alignment of LLMs in Chinese Paper • 2311.06899 • Published Nov 12, 2023 • 2