Skip to content

[Fix] Remove MoRI-IO patches from vLLM Disagg benchmarks #1585

Open
simondanielsson wants to merge 8 commits into
mainfrom
fix/remove-vllm-disagg-patches
Open

[Fix] Remove MoRI-IO patches from vLLM Disagg benchmarks #1585
simondanielsson wants to merge 8 commits into
mainfrom
fix/remove-vllm-disagg-patches

Conversation

@simondanielsson

@simondanielsson simondanielsson commented May 29, 2026

Copy link
Copy Markdown
Collaborator

These patches were upstreamed in vllm-project/vllm#40344 so we can use the nightly image instead.

Switching to nightly also requires us to:

  • rename a2a backend from mori to mori_low_latency
  • change MORI_READ_MODE=1 envvar to a read_mode=1 flag.

Run: https://github.com/SemiAnalysisAI/InferenceX/actions/runs/26813329592

Results from this run are very similar to the existing Kimi vllm disagg results, as expected


Note

Low Risk
Benchmark and container bootstrap only; no application auth or production serving paths, with behavior intended to match prior patched runs via upstream vLLM.

Overview
Moves MI355X vLLM disaggregated Kimi K2.5 (FP4) and MiniMax M2.5 (FP8) benchmarks onto a newer vLLM ROCm nightly that includes upstream MoRI-IO fixes (vllm#40344), so the large runtime Python patches in setup_deps.sh are removed.

Config and launch: amd-master.yaml drops per-scenario VLLM_MORIIO_CONNECTOR_READ_MODE settings; models_vllm.yaml switches MoE all2all from mori to mori_low_latency; prefill/decode/consumer kv-transfer-config now sets read_mode: true in kv_connector_extra_config. job.slurm / submit.sh no longer pass the old read-mode env var; default vllm-router image is bumped. perf-changelog.yaml documents the change.

Reviewed by Cursor Bugbot for commit f3b4132. Bugbot is set up for automated code reviews on this repo. Configure here.

…m image

Signed-off-by: simondanielsson <simon.danielsson99@hotmail.com>
@github-actions

Copy link
Copy Markdown
Contributor

Thanks for the contribution! For vLLM & SGLang, please ensure that your recipes is similar to the official vLLM recipes and/or the SGLang cookbook

If it is not, please create a PR first before we can merge your single node PR into the master branch. Let's ensure that the documentation is first class such that the entire ML community can benefit from your hard work! Thank you

PR authors are responsible for ensuring that after merging, all GitHub Action jobs fully pass. A lot of the time, failures are just flakes and simply re-running the failed jobs will fix it. If re-running failed jobs is attempted, PR authors are responsible for ensuring it passes. See GitHub's docs on re-running failed jobs: https://docs.github.com/en/actions/how-tos/manage-workflow-runs/re-run-workflows-and-jobs#re-running-failed-jobs-in-a-workflow

As a rule of thumb, generally, PR authors should request a review & get a PR approval from the respective companies' CODEOWNERS before requesting a review from core maintainers.

If additional help is needed, PR authors can reach out to core maintainers over Slack.

Signed-off-by: simondanielsson <simon.danielsson99@hotmail.com>
Signed-off-by: simondanielsson <simon.danielsson99@hotmail.com>
Signed-off-by: simondanielsson <simon.danielsson99@hotmail.com>
Comment thread .github/configs/amd-master.yaml Outdated
image: vllm/vllm-openai-rocm:nightly-bf610c2f56764e1b30bc6065f4ceace3d6e59036
# TODO(simondanielsson): change to pinned version once https://github.com/vllm-project/vllm/pull/40344
# is part of official release, likely 0.22.0.
image: vllm/vllm-openai-rocm:nightly

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can u have an specific night hash pinned here here instead of just the generic "nightly"

@simondanielsson simondanielsson Jun 3, 2026

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can! The caveat (and why I didn't pin it here) was that the pinned nightlies are pruned from docker hub after 14 days so the config will then cease working in 2 weeks. WDYT?


minimaxm2.5-fp8-mi355x-vllm-disagg:
image: vllm/vllm-openai-rocm:nightly-a6682d1d259cca69a9ae737ea5608fbbe7520031
# TODO(simondanielsson): change to pinned version once https://github.com/vllm-project/vllm/pull/40344

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can u have an specific night hash pinned here here instead of just the generic "nightly"

Comment thread benchmarks/multi_node/amd_utils/setup_deps.sh
@simondanielsson simondanielsson marked this pull request as ready for review June 3, 2026 08:23
@simondanielsson simondanielsson requested a review from a team June 3, 2026 08:23
@claude

claude Bot commented Jun 3, 2026

Copy link
Copy Markdown
Contributor

Claude finished @simondanielsson's task in 2m 25s —— View job


PR Review: Remove MoRI-IO patches from vLLM Disagg benchmarks

  • Gather context and read changed files
  • Validate master config changes and perf-changelog
  • Check for bugs, logic errors, and breaking changes
  • Post review findings

Summary

Clean removal of ~555 lines of MoRI-IO monkey patches now that they've been upstreamed in vllm#40344. The three changes (backend rename morimori_low_latency, env var → read_mode config flag, patch removal) are consistent and the CI run confirms they work.

One blocking issue: perf-changelog.yaml must be updated since amd-master.yaml was modified — see inline comment.

The unpinned nightly image tags are already under discussion with the reviewer.

@simondanielsson

Copy link
Copy Markdown
Collaborator Author

Run is succeeding, and perf is very similar to the existing Kimi vllm-disagg results (with some noise of course)

Comment thread .github/configs/amd-master.yaml
Comment thread benchmarks/multi_node/amd_utils/setup_deps.sh
Comment thread .github/configs/amd-master.yaml
Signed-off-by: simondanielsson <simon.danielsson99@hotmail.com>
Signed-off-by: simondanielsson <simon.danielsson99@hotmail.com>

@cursor cursor Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Fix All in Cursor

❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

Reviewed by Cursor Bugbot for commit 3c94a6f. Configure here.

Comment thread benchmarks/multi_node/amd_utils/job.slurm Outdated
Signed-off-by: simondanielsson <simon.danielsson99@hotmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: No status

Development

Successfully merging this pull request may close these issues.

2 participants