[Fix] Remove MoRI-IO patches from vLLM Disagg benchmarks by simondanielsson · Pull Request #1585 · SemiAnalysisAI/InferenceX

simondanielsson · 2026-05-29T07:38:07Z

These patches were upstreamed in vllm-project/vllm#40344 so we can use the nightly image instead.

Switching to nightly also requires us to:

- rename a2a backend from mori to mori_low_latency
- change MORI_READ_MODE=1 envvar to a read_mode=1 flag.
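
The two renames listed above can be sketched as a small migration helper. This is only an illustration: the function name `migrate_settings` and the dictionary keys `all2all_backend`, `env`, and `kv_connector_extra_config` are hypothetical and are not claimed to match the repository's actual config schema. Only the values `mori`, `mori_low_latency`, `MORI_READ_MODE`, and `read_mode` come from the description.

```python
import copy


def migrate_settings(old: dict) -> dict:
    """Return a copy of `old` with the two renames applied.

    The key names used here are hypothetical; they only illustrate the
    shape of the change, not the real benchmark config layout.
    """
    new = copy.deepcopy(old)

    # 1. The a2a backend was renamed in the nightly image.
    if new.get("all2all_backend") == "mori":
        new["all2all_backend"] = "mori_low_latency"

    # 2. The read-mode environment variable becomes a connector flag.
    env = new.get("env", {})
    if "MORI_READ_MODE" in env:
        enabled = env.pop("MORI_READ_MODE") == "1"
        new.setdefault("kv_connector_extra_config", {})["read_mode"] = enabled

    return new
```

The helper deep-copies its input, so the old settings stay intact for side-by-side comparison.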

Run: https://github.com/SemiAnalysisAI/InferenceX/actions/runs/26813329592

Results from this run are very similar to the existing Kimi vLLM disagg results, as expected.

Note

Low Risk
Benchmark and container bootstrap only; no application auth or production serving paths, with behavior intended to match prior patched runs via upstream vLLM.

Overview
Moves MI355X vLLM disaggregated Kimi K2.5 (FP4) and MiniMax M2.5 (FP8) benchmarks onto a newer vLLM ROCm nightly that includes upstream MoRI-IO fixes (vllm#40344), so the large runtime Python patches in setup_deps.sh are removed.

Config and launch: amd-master.yaml drops per-scenario VLLM_MORIIO_CONNECTOR_READ_MODE settings; models_vllm.yaml switches MoE all2all from mori to mori_low_latency; prefill/decode/consumer kv-transfer-config now sets read_mode: true in kv_connector_extra_config. job.slurm / submit.sh no longer pass the old read-mode env var; default vllm-router image is bumped. perf-changelog.yaml documents the change.
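
As a rough sketch of what "sets read_mode: true in kv_connector_extra_config" could look like when the launch scripts assemble the JSON for `--kv-transfer-config`: the helper name `build_kv_transfer_config` is hypothetical, and the connector name `MoRIIOConnector` is an assumption rather than something stated in this PR. The actual strings used by the benchmark scripts may differ.

```python
import json


def build_kv_transfer_config(role: str, read_mode: bool = True) -> str:
    """Build a JSON string suitable for a --kv-transfer-config argument.

    `role` is expected to be "kv_producer" (prefill) or "kv_consumer"
    (decode). The connector name below is an assumption.
    """
    if role not in ("kv_producer", "kv_consumer"):
        raise ValueError(f"unexpected kv_role: {role!r}")

    config = {
        "kv_connector": "MoRIIOConnector",  # assumed connector name
        "kv_role": role,
        # The flag that replaces the old read-mode environment variable.
        "kv_connector_extra_config": {"read_mode": read_mode},
    }
    return json.dumps(config)
```

Generating the string from a dictionary rather than hand-writing it keeps the prefill and decode launch commands from drifting apart.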

Reviewed by Cursor Bugbot for commit f3b4132.

github-actions · 2026-05-29T07:38:18Z

Thanks for the contribution! For vLLM & SGLang, please ensure that your recipe is similar to the official vLLM recipes and/or the SGLang cookbook.

If it is not, please create a PR first before we can merge your single node PR into the master branch. Let's ensure that the documentation is first class such that the entire ML community can benefit from your hard work! Thank you

PR authors are responsible for ensuring that after merging, all GitHub Action jobs fully pass. A lot of the time, failures are just flakes and simply re-running the failed jobs will fix it. If re-running failed jobs is attempted, PR authors are responsible for ensuring it passes. See GitHub's docs on re-running failed jobs: https://docs.github.com/en/actions/how-tos/manage-workflow-runs/re-run-workflows-and-jobs#re-running-failed-jobs-in-a-workflow

As a rule of thumb, generally, PR authors should request a review & get a PR approval from the respective companies' CODEOWNERS before requesting a review from core maintainers.

If additional help is needed, PR authors can reach out to core maintainers over Slack.

functionstackx · 2026-06-02T23:06:50Z

-  image: vllm/vllm-openai-rocm:nightly-bf610c2f56764e1b30bc6065f4ceace3d6e59036
+  # TODO(simondanielsson): change to pinned version once https://github.com/vllm-project/vllm/pull/40344
+  # is part of official release, likely 0.22.0.
+  image: vllm/vllm-openai-rocm:nightly


Can you pin a specific nightly hash here instead of just the generic "nightly"?

We can! The caveat (and why I didn't pin it here) is that pinned nightlies are pruned from Docker Hub after 14 days, so the config would stop working after two weeks. WDYT?
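
The distinction under discussion, a floating `nightly` tag versus a tag pinned to a commit hash, can be checked mechanically. The following is a minimal sketch: the helper name `is_pinned_nightly` is hypothetical, and the pattern assumes pinned tags always take the form `nightly-<40 hex characters>`, as in the tag removed by this diff.

```python
import re

# Assumed tag shape: "<repo>:nightly-<40-character lowercase hex commit hash>".
PINNED_NIGHTLY = re.compile(r"^[\w./-]+:nightly-[0-9a-f]{40}$")


def is_pinned_nightly(image: str) -> bool:
    """Return True if `image` names a commit-pinned nightly tag."""
    return PINNED_NIGHTLY.match(image) is not None
```

A check like this only confirms that a tag is pinned. It does not address the pruning caveat raised above, since a pinned tag can still disappear from the registry.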

functionstackx · 2026-06-02T23:06:56Z


 minimaxm2.5-fp8-mi355x-vllm-disagg:
-  image: vllm/vllm-openai-rocm:nightly-a6682d1d259cca69a9ae737ea5608fbbe7520031
+  # TODO(simondanielsson): change to pinned version once https://github.com/vllm-project/vllm/pull/40344


Can you pin a specific nightly hash here instead of just the generic "nightly"?

claude · 2026-06-03T08:24:01Z

Claude finished @simondanielsson's task in 2m 25s.

PR Review: Remove MoRI-IO patches from vLLM Disagg benchmarks

Gather context and read changed files
Validate master config changes and perf-changelog
Check for bugs, logic errors, and breaking changes
Post review findings

Summary

Clean removal of ~555 lines of MoRI-IO monkey patches now that they've been upstreamed in vllm#40344. The three changes (backend rename mori → mori_low_latency, env var → read_mode config flag, patch removal) are consistent and the CI run confirms they work.

One blocking issue: perf-changelog.yaml must be updated since amd-master.yaml was modified — see inline comment.
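
The rule behind this blocking issue (a change to amd-master.yaml should be accompanied by a change to perf-changelog.yaml) can be expressed as a simple check over the list of changed files. This is a sketch under assumptions: the function name `changelog_missing` is hypothetical, files are matched by basename only, and how the repository's CI actually enforces the rule is not shown in this thread.

```python
from pathlib import PurePosixPath


def changelog_missing(changed_files: list[str]) -> bool:
    """Return True if amd-master.yaml changed but perf-changelog.yaml did not.

    Files are compared by basename, so the directory layout does not matter.
    """
    names = {PurePosixPath(path).name for path in changed_files}
    return "amd-master.yaml" in names and "perf-changelog.yaml" not in names
```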

The unpinned nightly image tags are already under discussion with the reviewer.

simondanielsson · 2026-06-03T08:24:14Z

Run is succeeding, and perf is very similar to the existing Kimi vllm-disagg results (with some noise, of course).

cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Reviewed by Cursor Bugbot for commit 3c94a6f.

fix: remove moriio connector patches after bumping to new nightly vllm image

89b9243

Signed-off-by: simondanielsson <simon.danielsson99@hotmail.com>

github-project-automation Bot added this to InferenceMAX Board May 29, 2026

arygupt mentioned this pull request May 29, 2026

fix(process_result): fail loudly on zero-throughput disagg runs (no more masked ZeroDivisionError) #1590

Open

arygupt force-pushed the fix/remove-vllm-disagg-patches branch from 908939c to 89b9243 Compare May 30, 2026 02:47

simondanielsson added 3 commits June 2, 2026 09:40

fix: move read mode envvar to flag

d2aadee

Signed-off-by: simondanielsson <simon.danielsson99@hotmail.com>

fix: unpin vllm-router image

24555d3

Signed-off-by: simondanielsson <simon.danielsson99@hotmail.com>

fix: update to mori_low_latency backend after rename in nightly

dfdbc7d

Signed-off-by: simondanielsson <simon.danielsson99@hotmail.com>

functionstackx reviewed Jun 2, 2026

Comment thread benchmarks/multi_node/amd_utils/setup_deps.sh

simondanielsson marked this pull request as ready for review June 3, 2026 08:23

simondanielsson requested a review from a team June 3, 2026 08:23

simondanielsson requested review from 1am9trash, billishyahao, chunfangamd, seungrokj and yctseng0211 as code owners June 3, 2026 08:23

claude Bot reviewed Jun 3, 2026

Comment thread .github/configs/amd-master.yaml

claude Bot reviewed Jun 3, 2026

Comment thread benchmarks/multi_node/amd_utils/setup_deps.sh

Comment thread .github/configs/amd-master.yaml

simondanielsson added 3 commits June 4, 2026 09:00

fix: pin nightlyies

0c16e44

Signed-off-by: simondanielsson <simon.danielsson99@hotmail.com>

fix: comments and add perf-changelog.yml

3c94a6f

Signed-off-by: simondanielsson <simon.danielsson99@hotmail.com>

Merge remote-tracking branch 'upstream' into fix/remove-vllm-disagg-patches

3412624

cursor Bot reviewed Jun 4, 2026

Comment thread benchmarks/multi_node/amd_utils/job.slurm Outdated

fix: pint router iamge as well

f3b4132

Signed-off-by: simondanielsson <simon.danielsson99@hotmail.com>

simondanielsson requested a review from functionstackx June 4, 2026 07:15

[Fix] Remove MoRI-IO patches from vLLM Disagg benchmarks #1585
simondanielsson wants to merge 8 commits into main from fix/remove-vllm-disagg-patches
