Checkpoints for "Learn the Ropes, Then Trust the Wins: Self-imitation with Progressive Exploration for Agentic Reinforcement Learning" (arXiv:2509.22601)
Yulei Qin
yolay
AI & ML interests
Medical Imaging, Computer Vision, Language Models
Recent Activity
Updated a model about 21 hours ago: yolay/SPEAR-SearchQA-Qwen2.5-14B
Updated a model 2 days ago: yolay/SPEAR-SearchQA-Qwen2.5-7B
Updated a collection 2 days ago: SPEAR