Collections
Discover the best community collections!
Collections including paper arxiv:2602.03143
-
Behavior Knowledge Merge in Reinforced Agentic Models
Paper • 2601.13572 • Published • 24 -
Language of Thought Shapes Output Diversity in Large Language Models
Paper • 2601.11227 • Published • 9 -
Agentic-R: Learning to Retrieve for Agentic Search
Paper • 2601.11888 • Published • 19 -
RLAnything: Forge Environment, Policy, and Reward Model in Completely Dynamic RL System
Paper • 2602.02488 • Published • 32
-
Low-probability Tokens Sustain Exploration in Reinforcement Learning with Verifiable Reward
Paper • 2510.03222 • Published • 75 -
In-the-Flow Agentic System Optimization for Effective Planning and Tool Use
Paper • 2510.05592 • Published • 107 -
Less is More: Recursive Reasoning with Tiny Networks
Paper • 2510.04871 • Published • 507 -
Multi-Agent Tool-Integrated Policy Optimization
Paper • 2510.04678 • Published • 31
-
Diffusion Augmented Agents: A Framework for Efficient Exploration and Transfer Learning
Paper • 2407.20798 • Published • 24 -
Offline Reinforcement Learning for LLM Multi-Step Reasoning
Paper • 2412.16145 • Published • 38 -
REINFORCE++: A Simple and Efficient Approach for Aligning Large Language Models
Paper • 2501.03262 • Published • 104 -
SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution
Paper • 2502.18449 • Published • 75
-
Beyond Imitation: Reinforcement Learning for Active Latent Planning
Paper • 2601.21598 • Published • 9 -
Teaching Models to Teach Themselves: Reasoning at the Edge of Learnability
Paper • 2601.18778 • Published • 40 -
Self-Hinting Language Models Enhance Reinforcement Learning
Paper • 2602.03143 • Published • 29 -
GigaBrain-0.5M*: a VLA That Learns From World Model-Based Reinforcement Learning
Paper • 2602.12099 • Published • 53
-
Parallel-R1: Towards Parallel Thinking via Reinforcement Learning
Paper • 2509.07980 • Published • 103 -
Tree Search for LLM Agent Reinforcement Learning
Paper • 2509.21240 • Published • 92 -
Stabilizing Reinforcement Learning with LLMs: Formulation and Practices
Paper • 2512.01374 • Published • 105 -
How Far Are We from Genuinely Useful Deep Research Agents?
Paper • 2512.01948 • Published • 56
-
WorldVLA: Towards Autoregressive Action World Model
Paper • 2506.21539 • Published • 40 -
Fast and Simplex: 2-Simplicial Attention in Triton
Paper • 2507.02754 • Published • 25 -
IntFold: A Controllable Foundation Model for General and Specialized Biomolecular Structure Prediction
Paper • 2507.02025 • Published • 35 -
Thinking Beyond Tokens: From Brain-Inspired Intelligence to Cognitive Foundations for Artificial General Intelligence and its Societal Impact
Paper • 2507.00951 • Published • 24
-
facebook/w2v-bert-2.0
Feature Extraction • 0.6B • Updated • 3.32M • 204 -
facebook/metaclip-h14-fullcc2.5b
Zero-Shot Image Classification • 1.0B • Updated • 21.6k • 49 -
openai/clip-vit-large-patch14
Zero-Shot Image Classification • 0.4B • Updated • 7.57M • 1.96k -
Salesforce/blip-image-captioning-large
Image-to-Text • 0.5B • Updated • 723k • 1.45k
-
Beyond Imitation: Reinforcement Learning for Active Latent Planning
Paper • 2601.21598 • Published • 9 -
Teaching Models to Teach Themselves: Reasoning at the Edge of Learnability
Paper • 2601.18778 • Published • 40 -
Self-Hinting Language Models Enhance Reinforcement Learning
Paper • 2602.03143 • Published • 29 -
GigaBrain-0.5M*: a VLA That Learns From World Model-Based Reinforcement Learning
Paper • 2602.12099 • Published • 53
-
Behavior Knowledge Merge in Reinforced Agentic Models
Paper • 2601.13572 • Published • 24 -
Language of Thought Shapes Output Diversity in Large Language Models
Paper • 2601.11227 • Published • 9 -
Agentic-R: Learning to Retrieve for Agentic Search
Paper • 2601.11888 • Published • 19 -
RLAnything: Forge Environment, Policy, and Reward Model in Completely Dynamic RL System
Paper • 2602.02488 • Published • 32
-
Parallel-R1: Towards Parallel Thinking via Reinforcement Learning
Paper • 2509.07980 • Published • 103 -
Tree Search for LLM Agent Reinforcement Learning
Paper • 2509.21240 • Published • 92 -
Stabilizing Reinforcement Learning with LLMs: Formulation and Practices
Paper • 2512.01374 • Published • 105 -
How Far Are We from Genuinely Useful Deep Research Agents?
Paper • 2512.01948 • Published • 56
-
Low-probability Tokens Sustain Exploration in Reinforcement Learning with Verifiable Reward
Paper • 2510.03222 • Published • 75 -
In-the-Flow Agentic System Optimization for Effective Planning and Tool Use
Paper • 2510.05592 • Published • 107 -
Less is More: Recursive Reasoning with Tiny Networks
Paper • 2510.04871 • Published • 507 -
Multi-Agent Tool-Integrated Policy Optimization
Paper • 2510.04678 • Published • 31
-
WorldVLA: Towards Autoregressive Action World Model
Paper • 2506.21539 • Published • 40 -
Fast and Simplex: 2-Simplicial Attention in Triton
Paper • 2507.02754 • Published • 25 -
IntFold: A Controllable Foundation Model for General and Specialized Biomolecular Structure Prediction
Paper • 2507.02025 • Published • 35 -
Thinking Beyond Tokens: From Brain-Inspired Intelligence to Cognitive Foundations for Artificial General Intelligence and its Societal Impact
Paper • 2507.00951 • Published • 24
-
Diffusion Augmented Agents: A Framework for Efficient Exploration and Transfer Learning
Paper • 2407.20798 • Published • 24 -
Offline Reinforcement Learning for LLM Multi-Step Reasoning
Paper • 2412.16145 • Published • 38 -
REINFORCE++: A Simple and Efficient Approach for Aligning Large Language Models
Paper • 2501.03262 • Published • 104 -
SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution
Paper • 2502.18449 • Published • 75
-
facebook/w2v-bert-2.0
Feature Extraction • 0.6B • Updated • 3.32M • 204 -
facebook/metaclip-h14-fullcc2.5b
Zero-Shot Image Classification • 1.0B • Updated • 21.6k • 49 -
openai/clip-vit-large-patch14
Zero-Shot Image Classification • 0.4B • Updated • 7.57M • 1.96k -
Salesforce/blip-image-captioning-large
Image-to-Text • 0.5B • Updated • 723k • 1.45k