Skip to content

[AMD][MI35X] 0526 DSV4#1568

Open
1am9trash wants to merge 9 commits into
mainfrom
dsv4-mi355-sgl-0526
Open

[AMD][MI35X] 0526 DSV4#1568
1am9trash wants to merge 9 commits into
mainfrom
dsv4-mi355-sgl-0526

Conversation

@1am9trash
Copy link
Copy Markdown
Collaborator

@1am9trash 1am9trash commented May 27, 2026

Bump docker image to rocm/sgl-dev:rocm720-mi35x-f96ac98-20260526-DSv4

Successful Run: https://github.com/SemiAnalysisAI/InferenceX/actions/runs/26493895667
Visualize: https://inferencex.semianalysis.com/inference?unofficialRun=26493895667


Note

Low Risk
Benchmark and serving-knob changes only; no production app logic, auth, or data-path changes.

Overview
Updates DeepSeek-V4-Pro FP4 MI355X SGLang (dsv4-fp4-mi355x-sglang) for a newer ROCm image and higher-concurrency stability.

The AMD master config pins rocm/sgl-dev:rocm720-mi35x-f96ac98-20260526-DSv4 (was 8c3b5aa-20260521) and extends DP-attention sweeps so conc-end is 2048 on both 1k/1k and 8k/1k (was 1024 and 512).

dsv4_fp4_mi355x_sglang.sh sets AITER_BF16_FP8_MOE_BOUND=0, and adds --mem-fraction-static 0.90 and --swa-full-tokens-ratio 0.15 on sglang.launch_server to mitigate KV-cache pool exhaustion at high concurrency. perf-changelog.yaml records the change.

Reviewed by Cursor Bugbot for commit 3ad3afe. Bugbot is set up for automated code reviews on this repo. Configure here.

@github-actions

This comment was marked as duplicate.

1 similar comment
@github-actions
Copy link
Copy Markdown
Contributor

Thanks for the contribution! For vLLM & SGLang, please ensure that your recipes is similar to the official vLLM recipes and/or the SGLang cookbook

If it is not, please create a PR first before we can merge your single node PR into the master branch. Let's ensure that the documentation is first class such that the entire ML community can benefit from your hard work! Thank you

PR authors are responsible for ensuring that after merging, all GitHub Action jobs fully pass. A lot of the time, failures are just flakes and simply re-running the failed jobs will fix it. If re-running failed jobs is attempted, PR authors are responsible for ensuring it passes. See GitHub's docs on re-running failed jobs: https://docs.github.com/en/actions/how-tos/manage-workflow-runs/re-run-workflows-and-jobs#re-running-failed-jobs-in-a-workflow

As a rule of thumb, generally, PR authors should request a review & get a PR approval from the respective companies' CODEOWNERS before requesting a review from core maintainers.

If additional help is needed, PR authors can reach out to core maintainers over Slack.

Copy link
Copy Markdown
Contributor

@claude claude Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, straightforward image bump + concurrency widening for the DSv4 FP4 MI355X SGLang benchmark recipe.

Extended reasoning...

Overview

This PR bumps the docker image for the dsv4-fp4-mi355x-sglang benchmark from rocm/sgl-dev:rocm720-mi35x-8c3b5aa-20260521-DSv4 to ...-f96ac98-20260526-DSv4, widens the DP-attention concurrency sweep upper bounds (1024→4096 for 1024/1024, 512→2048 for 8192/1024), and adds --mem-fraction-static 0.90 and --swa-full-tokens-ratio 0.1 to the SGLang launch invocation. Only two files are touched, both benchmark/CI scaffolding.

Security risks

None. This PR touches no production app code, no auth/crypto/permissions, and no data paths. The changes affect only benchmark sweeps executed in CI runners.

Level of scrutiny

Low — this is exactly the kind of mechanical recipe bump that does not require human review. The image tag change is opaque (image content is upstream), the new flags are well-known SGLang server tuning knobs to match the newer image's memory/SWA behavior, and the only operational risk (longer sweeps, more GPU time, possible OOM at the higher conc) is contained to benchmark jobs, not production.

Other factors

No bugs were flagged by the bug-hunting system. The PR description (and Cursor Bugbot's auto-summary) explains the motivation. No outstanding reviewer comments to address — only the standard recipe-reminder bot comments.

@github-actions

This comment was marked as outdated.

cursor[bot]

This comment was marked as outdated.

@github-actions

This comment was marked as outdated.

@github-actions

This comment was marked as outdated.

1 similar comment
@github-actions

This comment was marked as outdated.

@github-actions

This comment was marked as outdated.

@github-actions

This comment was marked as outdated.

@github-actions
Copy link
Copy Markdown
Contributor

Copy link
Copy Markdown
Collaborator

@chunfangamd chunfangamd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great job.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

Status: No status

Development

Successfully merging this pull request may close these issues.

2 participants