Part I: Tricks or Traps? A Deep Dive into RL for LLM Reasoning
Paper • 2508.08221 • Published • 49
Reinforcement Learning for Reasoning in Large Language Models with One Training Example
Paper • 2504.20571 • Published • 98
RLPR: Extrapolating RLVR to General Domains without Verifiers
Paper • 2506.18254 • Published • 31
Igor Kilbas
kaleinaNyan
AI & ML interests
Computer Vision, NLP
Recent Activity
updated a collection 5 days ago: Good RL papers
updated a collection 9 days ago: Good RL papers
liked a model 4 months ago: openai/gpt-oss-20b
Organizations
None yet