MLLM - a hassenhamdi Collection

updated 25 days ago

NextFlow: Unified Sequential Modeling Activates Multimodal Understanding and Generation

Paper • 2601.02204 • Published Jan 5 • 62

HERMES: KV Cache as Hierarchical Memory for Efficient Streaming Video Understanding

Paper • 2601.14724 • Published 27 days ago • 74

VIOLA: Towards Video In-Context Learning with Minimal Annotations

Paper • 2601.15549 • Published 27 days ago • 4