Description
I am trying to run Agent-R1 on HiPerGator using the B200 GPU partition (hpg-b200). The B200 GPUs require PyTorch ≥2.7.0 built against CUDA 12.8 for sm_100 architecture support. However, the verl and vLLM versions that Agent-R1 depends on were originally pinned for PyTorch 2.3.0, which creates dependency conflicts.
Steps to Reproduce
- Create a clean conda env with Python 3.10.
- Install Verl (0.2.0.dev0) as per project instructions.
- Install PyTorch stack for B200:
pip install torch==2.7.0+cu128 torchvision==0.22.0+cu128 torchaudio==2.7.0+cu128 --index-url https://download.pytorch.org/whl/cu128
- Try installing tensordict, xformers, and vllm.
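After the PyTorch install step, it may help to confirm that the wheel actually includes sm_100 kernels before adding the remaining packages. The following is a minimal sketch, not part of the Agent-R1 instructions; `supports_arch` and `report` are hypothetical helper names, and the check assumes it runs on a GPU node, since `torch.cuda.get_arch_list()` returns an empty list when CUDA is unavailable.

```python
from typing import Iterable


def supports_arch(arch_list: Iterable[str], target: str = "sm_100") -> bool:
    """Return True if the target architecture is among the compiled ones."""
    return target in set(arch_list)


def report() -> str:
    """Summarize the installed torch build, or say that torch is missing."""
    try:
        import torch
    except ImportError:
        return "torch is not installed in this environment"

    # Empty on nodes without a visible GPU (e.g. a login node).
    archs = torch.cuda.get_arch_list()
    lines = [
        f"torch {torch.__version__}, built for CUDA {torch.version.cuda}",
        f"compiled architectures: {archs}",
        f"sm_100 supported: {supports_arch(archs)}",
    ]
    if torch.cuda.is_available():
        name = torch.cuda.get_device_name(0)
        capability = torch.cuda.get_device_capability(0)
        lines.append(f"device 0: {name}, compute capability {capability}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(report())
```

Running this again after the last install step would also show whether a later package replaced the torch build.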
Actual Results
- verl 0.2.0.dev0 requires tensordict<=0.6.2.
- tensordict needs to be >=0.7.0 to work with torch==2.7.0, which conflicts with the verl pin above.
- vllm 0.4.x (used in Agent-R1) only works with torch==2.3.0.
- vllm >=0.10.x works with torch==2.7.0, but introduces breaking API changes (for example, model_hf_config was removed).
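To see which versions actually ended up in the environment after these installs, the installed distributions can be listed with the standard library. This is a rough sketch; `installed_versions` is a hypothetical helper, and the package list simply mirrors the names mentioned in this issue.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    packages = ["torch", "torchvision", "torchaudio",
                "tensordict", "xformers", "vllm", "verl"]
    for name, ver in installed_versions(packages).items():
        print(f"{name:12s} {ver or 'not installed'}")
```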
Example error
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed.
verl 0.2.0.dev0 requires tensordict<=0.6.2, which is not installed.
torchaudio 2.7.0+cu128 requires torch==2.7.0, but you have torch 2.7.1 which is incompatible.
torchvision 0.22.0+cu128 requires torch==2.7.0, but you have torch 2.7.1 which is incompatible.
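For longer logs, the conflict lines can be pulled out programmatically. The sketch below assumes the message shape shown in the error above ("X V requires R, which is not installed." or "... but you have P V which is incompatible."); `Conflict` and `parse_conflicts` are hypothetical names, and the regular expression is not guaranteed to cover every pip message format.

```python
import re
from typing import List, NamedTuple, Optional


class Conflict(NamedTuple):
    package: str           # package that declares the requirement
    version: str           # installed version of that package
    requirement: str       # the requirement it declares
    found: Optional[str]   # "name version" actually installed, or None if missing


_PATTERN = re.compile(
    r"^(?P<pkg>\S+) (?P<ver>\S+) requires (?P<req>.+?), "
    r"(?:which is not installed"
    r"|but you have (?P<found>\S+ \S+) which is incompatible)\.$"
)


def parse_conflicts(text: str) -> List[Conflict]:
    """Extract dependency conflicts from pip resolver output."""
    conflicts = []
    for line in text.splitlines():
        match = _PATTERN.match(line.strip())
        if match:
            conflicts.append(
                Conflict(match["pkg"], match["ver"], match["req"], match["found"])
            )
    return conflicts


if __name__ == "__main__":
    log = """\
verl 0.2.0.dev0 requires tensordict<=0.6.2, which is not installed.
torchaudio 2.7.0+cu128 requires torch==2.7.0, but you have torch 2.7.1 which is incompatible.
"""
    for conflict in parse_conflicts(log):
        print(conflict)
```

Applied to the log above, the torchaudio and torchvision entries both report torch 2.7.1 as the installed version, which suggests a later install step replaced the 2.7.0+cu128 build.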
Expected Results
Ability to install and run Agent-R1/Verl with PyTorch 2.7.0 + CUDA 12.8 (for B200 GPUs) without dependency conflicts.
CC
@lyumengxian