-
Gemini Robotics: Bringing AI into the Physical World
Paper • 2503.20020 • Published • 29 -
Magma: A Foundation Model for Multimodal AI Agents
Paper • 2502.13130 • Published • 58 -
LLaVA-Plus: Learning to Use Tools for Creating Multimodal Agents
Paper • 2311.05437 • Published • 51 -
OS-ATLAS: A Foundation Action Model for Generalist GUI Agents
Paper • 2410.23218 • Published • 49
Collections
Discover the best community collections!
Collections including paper arxiv:2511.21678
-
SuperWriter: Reflection-Driven Long-Form Generation with Large Language Models
Paper • 2506.04180 • Published • 33 -
AniMaker: Automated Multi-Agent Animated Storytelling with MCTS-Driven Clip Generation
Paper • 2506.10540 • Published • 37 -
AutoMind: Adaptive Knowledgeable Agent for Automated Data Science
Paper • 2506.10974 • Published • 19 -
SPAR: Scholar Paper Retrieval with LLM-based Agents for Enhanced Academic Search
Paper • 2507.15245 • Published • 11
-
Large Language Models Orchestrating Structured Reasoning Achieve Kaggle Grandmaster Level
Paper • 2411.03562 • Published • 68 -
Training Language Models for Social Deduction with Multi-Agent Reinforcement Learning
Paper • 2502.06060 • Published • 38 -
MLGym: A New Framework and Benchmark for Advancing AI Research Agents
Paper • 2502.14499 • Published • 192 -
SurveyX: Academic Survey Automation via Large Language Models
Paper • 2502.14776 • Published • 100
-
Canvas-to-Image: Compositional Image Generation with Multimodal Controls
Paper • 2511.21691 • Published • 32 -
Agentic Learner with Grow-and-Refine Multimodal Semantic Memory
Paper • 2511.21678 • Published • 10 -
ENACT: Evaluating Embodied Cognition with World Modeling of Egocentric Interaction
Paper • 2511.20937 • Published • 15
-
agents-course/notebooks
Updated • 506 -
Advances and Challenges in Foundation Agents: From Brain-Inspired Intelligence to Evolutionary, Collaborative, and Safe Systems
Paper • 2504.01990 • Published • 299 -
Tiny Model, Big Logic: Diversity-Driven Optimization Elicits Large-Model Reasoning Ability in VibeThinker-1.5B
Paper • 2511.06221 • Published • 128 -
Agentic Learner with Grow-and-Refine Multimodal Semantic Memory
Paper • 2511.21678 • Published • 10
-
Gemini Robotics: Bringing AI into the Physical World
Paper • 2503.20020 • Published • 29 -
Magma: A Foundation Model for Multimodal AI Agents
Paper • 2502.13130 • Published • 58 -
LLaVA-Plus: Learning to Use Tools for Creating Multimodal Agents
Paper • 2311.05437 • Published • 51 -
OS-ATLAS: A Foundation Action Model for Generalist GUI Agents
Paper • 2410.23218 • Published • 49
-
Canvas-to-Image: Compositional Image Generation with Multimodal Controls
Paper • 2511.21691 • Published • 32 -
Agentic Learner with Grow-and-Refine Multimodal Semantic Memory
Paper • 2511.21678 • Published • 10 -
ENACT: Evaluating Embodied Cognition with World Modeling of Egocentric Interaction
Paper • 2511.20937 • Published • 15
-
SuperWriter: Reflection-Driven Long-Form Generation with Large Language Models
Paper • 2506.04180 • Published • 33 -
AniMaker: Automated Multi-Agent Animated Storytelling with MCTS-Driven Clip Generation
Paper • 2506.10540 • Published • 37 -
AutoMind: Adaptive Knowledgeable Agent for Automated Data Science
Paper • 2506.10974 • Published • 19 -
SPAR: Scholar Paper Retrieval with LLM-based Agents for Enhanced Academic Search
Paper • 2507.15245 • Published • 11
-
agents-course/notebooks
Updated • 506 -
Advances and Challenges in Foundation Agents: From Brain-Inspired Intelligence to Evolutionary, Collaborative, and Safe Systems
Paper • 2504.01990 • Published • 299 -
Tiny Model, Big Logic: Diversity-Driven Optimization Elicits Large-Model Reasoning Ability in VibeThinker-1.5B
Paper • 2511.06221 • Published • 128 -
Agentic Learner with Grow-and-Refine Multimodal Semantic Memory
Paper • 2511.21678 • Published • 10
-
Large Language Models Orchestrating Structured Reasoning Achieve Kaggle Grandmaster Level
Paper • 2411.03562 • Published • 68 -
Training Language Models for Social Deduction with Multi-Agent Reinforcement Learning
Paper • 2502.06060 • Published • 38 -
MLGym: A New Framework and Benchmark for Advancing AI Research Agents
Paper • 2502.14499 • Published • 192 -
SurveyX: Academic Survey Automation via Large Language Models
Paper • 2502.14776 • Published • 100