Understanding R1-Zero-Like Training: A Critical Perspective • Paper • 2503.20783 • Published Mar 26
Reinforcing Multi-Turn Reasoning in LLM Agents via Turn-Level Credit Assignment • Paper • 2505.11821 • Published May 17