Beyond Correctness: Harmonizing Process and Outcome Rewards through RL Training • Paper • 2509.03403 • Published Sep 3
RAG-RL: Advancing Retrieval-Augmented Generation via RL and Curriculum Learning • Paper • 2503.12759 • Published Mar 17