- Imagine-then-Plan: Agent Learning from Adaptive Lookahead with World Models
  Paper • 2601.08955 • Published • 13
- EvoFSM: Controllable Self-Evolution for Deep Research with Finite State Machines
  Paper • 2601.09465 • Published • 41
- MAXS: Meta-Adaptive Exploration with LLM Agents
  Paper • 2601.09259 • Published • 95
- Toward Efficient Agents: Memory, Tool learning, and Planning
  Paper • 2601.14192 • Published • 54
Collections
Collections including paper arxiv:2602.02361
- LLM-in-Sandbox Elicits General Agentic Intelligence
  Paper • 2601.16206 • Published • 84
- Guidelines to Prompt Large Language Models for Code Generation: An Empirical Characterization
  Paper • 2601.13118 • Published • 1
- SWE-Universe: Scale Real-World Verifiable Environments to Millions
  Paper • 2602.02361 • Published • 60

- CODESIM: Multi-Agent Code Generation and Problem Solving through Simulation-Driven Planning and Debugging
  Paper • 2502.05664 • Published • 24
- AgentCoder: Multi-Agent-based Code Generation with Iterative Testing and Optimisation
  Paper • 2312.13010 • Published • 6
- HyperAgent: Generalist Software Engineering Agents to Solve Coding Tasks at Scale
  Paper • 2409.16299 • Published • 11
- Vibe Coding vs. Agentic Coding: Fundamentals and Practical Implications of Agentic AI
  Paper • 2505.19443 • Published • 15

- SWE-Universe: Scale Real-World Verifiable Environments to Millions
  Paper • 2602.02361 • Published • 60
- LongCodeZip: Compress Long Context for Code Language Models
  Paper • 2510.00446 • Published • 107
- Code2World: A GUI World Model via Renderable Code Generation
  Paper • 2602.09856 • Published • 186
- Terminal-Bench: Benchmarking Agents on Hard, Realistic Tasks in Command Line Interfaces
  Paper • 2601.11868 • Published • 32

- The Debugging Decay Index: Rethinking Debugging Strategies for Code LLMs
  Paper • 2506.18403 • Published • 3
- ReCode: Updating Code API Knowledge with Reinforcement Learning
  Paper • 2506.20495 • Published • 10
- SWE-Debate: Competitive Multi-Agent Debate for Software Issue Resolution
  Paper • 2507.23348 • Published • 12
- LoCoBench: A Benchmark for Long-Context Large Language Models in Complex Software Engineering
  Paper • 2509.09614 • Published • 7

- BitNet: Scaling 1-bit Transformers for Large Language Models
  Paper • 2310.11453 • Published • 105
- Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection
  Paper • 2310.11511 • Published • 78
- In-Context Learning Creates Task Vectors
  Paper • 2310.15916 • Published • 43
- Matryoshka Diffusion Models
  Paper • 2310.15111 • Published • 44