Hi, thanks for releasing PRO-V and the PRO-V-R1-8B model.
I was trying to reproduce the full PRO-V system described in the paper/README, especially the Judge, majority voting, Testbench Edit Tool, and iterative verifier/refinement loop. After reading the current main branch, it seems that the released evaluation code is a simplified pipeline:
- `prompting_top_agent_simple.py` only runs GenTBAgent -> PyCheckerAgent -> validation.
- `prompting_top_agent_ray.py` initializes GenTBAgent and PyCheckerAgent, but I could not find an initialized VerifierAgent.
- The verification branch calls `self.verifier_agent.run(...)`, but I could not find the corresponding verifier agent implementation.
- The code comments in the MODIFY_PYCHECKER / MODIFY_TESTBENCH branches mention that a full implementation would pass feedback or directly edit `testbench.json`, but the current code appears to regenerate instead.
- Multiple PyChecker samples are generated, but the current selection seems to use the first successful sample rather than a majority-vote/Judge-based selection.
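
To make the last point concrete, here is a minimal sketch of the difference I mean. It is only an illustration of the two selection strategies, not the PRO-V implementation: the function names and the data layout (each PyChecker sample reduced to a list of predicted outputs, or `None` if the sample failed to run) are my own assumptions.

```python
from collections import Counter
from typing import Hashable, Optional, Sequence

# Hypothetical representation: one entry per PyChecker sample, holding the
# outputs it predicts for the shared testbench inputs, or None on failure.
Sample = Optional[Sequence[Hashable]]


def first_successful(samples: Sequence[Sample]) -> Optional[int]:
    """Return the index of the first sample that ran successfully."""
    for index, outputs in enumerate(samples):
        if outputs is not None:
            return index
    return None


def majority_vote(samples: Sequence[Sample]) -> Optional[int]:
    """Return the index of a sample whose outputs agree with the largest group."""
    valid = [(i, tuple(out)) for i, out in enumerate(samples) if out is not None]
    if not valid:
        return None
    counts = Counter(outputs for _, outputs in valid)
    winning_outputs, _ = counts.most_common(1)[0]
    # Pick the earliest sample that produced the winning outputs.
    return next(i for i, outputs in valid if outputs == winning_outputs)


samples = [[0, 1], [1, 1], [1, 1], None]
print(first_successful(samples), majority_vote(samples))
```

With these inputs the two strategies pick different samples, which is why the selection rule matters when comparing against the numbers reported in the paper.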
Could you clarify whether the full Judge / verifier / Testbench Edit Tool / majority-vote implementation is planned to be released? If it is available in another branch or repository, could you point me to it?
This would be very helpful for reproducing the full PRO-V system rather than only the simplified evaluation pipeline.