Update Minimax2.5 H100 by faradawn · Pull Request #484 · vllm-project/recipes

faradawn · 2026-05-21T23:32:32Z

Update the MiniMax-M2.5 recipe to match the new H100 FP8 launch flags validated in SemiAnalysisAI/InferenceX#1516: drop the explicit cudagraph_mode from the compilation-config, switch the reasoning parser to minimax_m2_append_think, and document TEP=8 (tensor-parallel-size 8 with --enable-expert-parallel) as the recommended single-node H100 strategy in place of TP4+EP.

Signed-off-by: Faradawn Yang <73060648+faradawn@users.noreply.github.com>
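
As a rough illustration of the flags described above, the sketch below assembles the equivalent `vllm serve` argument list. It assumes the recipe is launched through the `vllm serve` entry point and that the model is published as `MiniMaxAI/MiniMax-M2.5`; that model ID and the helper name `build_serve_args` are assumptions for illustration and do not come from this PR. The flag names and values are the ones quoted in the description and diffs.

```python
import json
import shlex

# Assumed Hugging Face model ID; not stated anywhere in this PR thread.
MODEL_ID = "MiniMaxAI/MiniMax-M2.5"


def build_serve_args(model=MODEL_ID, tp_size=8):
    """Return a `vllm serve` argument list for the TEP=8 setup (hypothetical helper)."""
    # Compilation config after this PR: no explicit "cudagraph_mode" key.
    compilation_config = {
        "mode": 3,
        "pass_config": {"fuse_minimax_qk_norm": True},
    }
    return [
        "vllm", "serve", model,
        # TEP=8: tensor parallel size 8 combined with expert parallelism.
        "--tensor-parallel-size", str(tp_size),
        "--enable-expert-parallel",
        # Reasoning parser switched from minimax_m2 in this PR.
        "--reasoning-parser", "minimax_m2_append_think",
        "--trust-remote-code",
        # Compact JSON, matching the string form used in the recipe diff.
        "--compilation-config",
        json.dumps(compilation_config, separators=(",", ":")),
    ]


if __name__ == "__main__":
    # Print a shell-quoted command line for copy and paste.
    print(shlex.join(build_serve_args()))
```

Building the list in code rather than by hand keeps the JSON passed to `--compilation-config` well-formed and makes it easy to confirm that `cudagraph_mode` is no longer present.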

vercel · 2026-05-21T23:32:38Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
vllm-recipes	Ready	Preview, Comment	May 21, 2026 11:42pm

gemini-code-assist

Code Review

This pull request updates the MiniMax-M2.5 model configuration by adopting the minimax_m2_append_think reasoning parser, removing the cudagraph_mode: PIECEWISE parameter, and increasing the tensor parallel size for H100 configurations. Review feedback highlights several inconsistencies where these updates were not applied to all relevant examples in the guide, specifically the NVIDIA H200 and AMD ROCm sections. Additionally, a correction was suggested for the H100 heading to align with the established naming convention.

gemini-code-assist · 2026-05-21T23:34:24Z

    - "--trust-remote-code"
    - "--compilation-config"
-    - '{"mode":3,"cudagraph_mode":"PIECEWISE","pass_config":{"fuse_minimax_qk_norm":true}}'
+    - '{"mode":3,"pass_config":{"fuse_minimax_qk_norm":true}}'


The removal of cudagraph_mode: PIECEWISE should be applied consistently across all examples in the guide section. While it has been removed here and in the Docker example (line 155), the NVIDIA H200 example at line 179 still includes this parameter.

gemini-code-assist · 2026-05-21T23:34:24Z

    args:
      - "--reasoning-parser"
-      - "minimax_m2"
+      - "minimax_m2_append_think"


The update to the minimax_m2_append_think reasoning parser should be applied consistently across all examples in the guide section. Currently, the NVIDIA H200 example (line 178) and the AMD ROCm example (line 204) still reference the old minimax_m2 parser.
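
The two consistency findings above (a leftover `cudagraph_mode` and a leftover `minimax_m2` parser) could be caught mechanically. The following is a minimal sketch, assuming the recipe is available as plain text and that stale values appear as quoted strings, as they do in the quoted diffs; the function name `find_stale_lines` and the patterns are illustrative and not part of the recipes repository.

```python
import re

# Patterns for values this PR replaces. The parser pattern requires a closing
# quote right after "minimax_m2", so "minimax_m2_append_think" does not match.
STALE_PATTERNS = {
    "old reasoning parser": re.compile(r"""["']minimax_m2["']"""),
    "explicit cudagraph_mode": re.compile(r'"cudagraph_mode"'),
}


def find_stale_lines(text):
    """Return (line_number, label) pairs for lines that still use old values."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for label, pattern in STALE_PATTERNS.items():
            if pattern.search(line):
                hits.append((lineno, label))
    return hits


if __name__ == "__main__":
    sample = "\n".join([
        '- "--reasoning-parser"',
        '- "minimax_m2"',
        "- '{\"mode\":3,\"cudagraph_mode\":\"PIECEWISE\"}'",
        '- "minimax_m2_append_think"',
    ])
    for lineno, label in find_stale_lines(sample):
        print(f"line {lineno}: {label}")
```

Running a check like this over the whole recipe file would flag the H200 and ROCm examples the review mentions, if they still carry the old values.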

gemini-code-assist · 2026-05-21T23:34:24Z

  Pure TP8 is not supported. For >4 GPUs use DP+EP or TP+EP.

-  ### TP4+EP (recommended for H100)
+  ### TEP=8 (recommended for H100)


The heading TEP=8 is inconsistent with the previous heading style (TP4+EP) and the terminology used in the preceding text (line 184: TP+EP). Consider using TP8+EP to maintain consistency with the rest of the document.

### TP8+EP (recommended for H100)


faradawn · 2026-05-21T23:42:41Z

NVFP4 H100 configs from https://github.com/SemiAnalysisAI/InferenceX/pull/1517/changes

esmeetu · 2026-05-22T00:19:10Z

Could you clarify why we need to change the reasoning parser?

faradawn · 2026-05-22T04:47:48Z

I think it's to add support for thinking.

Update MiniMax-M2.5 for H100 FP8 (TEP=8)

b741197

Signed-off-by: Faradawn Yang <73060648+faradawn@users.noreply.github.com>

vercel Bot deployed to Preview May 21, 2026 23:33

faradawn changed the title ~~update Minimax2.5 fp8 h100~~ Update Minimax2.5 H100 May 21, 2026

gemini-code-assist Bot reviewed May 21, 2026


add nvfp4 h100

7ac6d15

Signed-off-by: Faradawn Yang <73060648+faradawn@users.noreply.github.com>

vercel Bot deployed to Preview May 21, 2026 23:42

This was referenced May 22, 2026

[NV] update Minimax2.5 fp8 h100 vllm SemiAnalysisAI/InferenceX#1516

Merged

[NV] add minimax fp4 h100 vllm SemiAnalysisAI/InferenceX#1517

Closed

Update Minimax2.5 H100 #484
faradawn wants to merge 2 commits into vllm-project:main from faradawn:minimax-m25-h100-fp8

2 participants
