milad1378yz/MASPRM

MASPRM

Multi-Agent System Process Reward Model

A lightweight process reward model that guides multi-agent reasoning at search time.

Paper · PDF · Project Page

MASPRM training pipeline (main paper figure).

Highlights

  • MASPRM adds a process reward model to guide multi-agent systems.
  • Plugs into MCTS and inference-time search for better trajectory selection.
  • Improves exact-match accuracy on challenging reasoning benchmarks.
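
To make the second highlight concrete, here is a minimal sketch of how a process reward score could bias child selection in a UCT-style tree search. It is illustrative only and is not taken from this repository: the `Node` class, `uct_score`, `select_child`, the exploration constant `c`, and the `toy_prm` scorer are all hypothetical names, and the score range is an assumption.

```python
import math
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """One partial multi-agent trajectory (hypothetical structure)."""
    state: str
    prior_value: float = 0.0  # process reward score; range is an assumption
    visits: int = 0
    value_sum: float = 0.0
    children: List["Node"] = field(default_factory=list)

    def mean_value(self) -> float:
        # Until a node has been visited, fall back to the reward-model score.
        return self.value_sum / self.visits if self.visits else self.prior_value


def uct_score(parent: Node, child: Node, c: float = 1.4) -> float:
    # Standard UCT form: value estimate plus an exploration bonus.
    exploration = c * math.sqrt(math.log(parent.visits + 1) / (child.visits + 1))
    return child.mean_value() + exploration


def select_child(parent: Node) -> Node:
    # Pick the child with the highest UCT score.
    return max(parent.children, key=lambda ch: uct_score(parent, ch))


def toy_prm(state: str) -> float:
    """Stand-in scorer for illustration; NOT the trained MASPRM model."""
    return 1.0 if "check" in state else -0.5


root = Node("question", visits=1)
root.children = [
    Node(s, prior_value=toy_prm(s))
    for s in ["agent A: guess", "agent A: check units"]
]
best = select_child(root)
print(best.state)
```

Because both children are unvisited, their exploration bonuses are equal, so the reward score alone decides which branch is expanded first. In a real run the toy scorer would be replaced by a learned model.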

Quickstart

pip install -r requirements.txt
python src/run_mcts.py --dataset mmlu --split train --load_in_4bit --ray --gpus_per_actor 0.125 --actors 32
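
As a rough sizing check for the command above, the sketch below assumes that `--gpus_per_actor` is the fractional GPU share requested per Ray actor and that `--actors` is the number of actors. That reading is inferred from the flag names and is not confirmed by the source; the variable names are illustrative.

```python
# Values copied from the Quickstart command.
gpus_per_actor = 0.125
actors = 32

# Under the assumption above, total GPU demand is share * count.
total_gpus = gpus_per_actor * actors

# Number of actors that would share one GPU at this fraction.
actors_per_gpu = round(1 / gpus_per_actor)

print(f"total GPUs requested: {total_gpus}, actors per GPU: {actors_per_gpu}")
```

If that interpretation holds, a machine with fewer GPUs would need a smaller `--actors` value or a smaller per-actor share.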

Docker

docker build -t masprm .
docker run --rm -it -v "$PWD:/app" masprm python src/run_mcts.py --help

BibTeX

@article{yazdani2025masprm,
  title={{MASPRM}: Multi-Agent System Process Reward Model},
  author={Yazdani, Milad and Mostajabdaveh, Mahdi and Zhou, Zirui and Xiong, Ying},
  journal={arXiv preprint arXiv:2510.24803},
  year={2025}
}
