These are the checkpoints and dataset for the paper "From Trainee to Trainer: LLM-Designed Training Environment for RL with Multi-Agent Reasoning"
AI & ML interests
Large Language Models
Papers
From Trainee to Trainer: LLM-Designed Training Environment for RL with Multi-Agent Reasoning
Attention Amnesia in Hybrid LLMs: When CoT Fine-Tuning Breaks Long-Range Recall, and How to Fix It
These are the checkpoints and dataset for the paper "EnvFactory: Scaling Tool-Use Agents via Executable Environments Synthesis and Robust RL".
- EnvFactory: Scaling Tool-Use Agents via Executable Environments Synthesis and Robust RL
  Paper • 2605.18703 • Published • 50
- LARK-Lab/EnvFactory-1.7B
  Text Generation • 2B • Updated • 84
- LARK-Lab/EnvFactory-4B
  Text Generation • 4B • Updated • 8
- LARK-Lab/EnvFactory-8B
  Text Generation • 8B • Updated • 9