Unofficial PyTorch reproduction of Open-Reasoner-Zero: An Open Source Approach to Scaling Up Reinforcement Learning on the Base Model.
reinforcement-learning pytorch reproduction ppo llm-reasoning unofficial-implementation open-reasoner-zero
Updated Jun 10, 2026 - Python