[NV] Update B300 DSV4 SGLang Pareto sweep #1575
base: main
| Original file line number | Diff line number | Diff line change |
|---|---|---|
@@ -3171,3 +3171,11 @@ | |
| description: | ||
| - "Validates measured-power aggregation pipeline (PR #1558) on both NVIDIA (H200) and AMD (MI355X) hardware — different SMI tools (nvidia-smi vs amd-smi), different CSV schemas (power.draw [W] vs socket_power), same aggregator. No config change. Entry intentionally kept past merge so run-sweep produces canonical agg JSONs with avg_power_w + joules_per_output_token on main for both vendors, seeding the dashboard's day-zero data." | ||
| pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1558 | ||
|
|
||
| - config-keys: | ||
| - dsv4-fp4-b300-sglang | ||
| description: | ||
| - "Update DeepSeek-V4-Pro FP4 B300 SGLang non-MTP sweep to the 2026-05-19 8k/1k submission frontier: TP8 no-DP-attention c1-c64 and DEP8 DP-attention c512/c768/c1024/c1536/c2048" | ||
| - "Use lmsysorg/sglang:nightly-dev-cu13-20260522-7cf193fe to pick up the merged SGLang warmup path" | ||
| - "Map dp-attn=false to TP8 flashinfer_mxfp4 with chunked-prefill 8192; map dp-attn=true to DEP8 mixed-chunk MegaMoE throughput settings" | ||
| pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1552 | ||
|
Check failure on line 3181 in perf-changelog.yaml
Comment on lines +3175 to +3181
Contributor
🔴 The new

Extended reasoning...

What the bug is

The new entry appended at

pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1552

…but the PR being merged is #1575 (the rebased successor; the PR description itself notes "Rebased copy of #1552 with

Why this is a real issue, not just cosmetic

Every other recent entry in

Step-by-step proof

Fix

Update line 3181 to:

pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1575

or, equivalently, use the

pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/XXX

The historical reference to #1552 (which the PR description already provides) can stay in the PR description; the changelog entry should point at the PR that actually lands the change, both for convention and for tooling correctness.
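The mismatch described above could be caught mechanically. The following is a minimal sketch, not the repository's actual tooling: `pr_link_matches` and `PR_LINK_RE` are hypothetical names, and it assumes (as the comment suggests) that an `XXX` placeholder in place of the number is also acceptable.

```python
import re

# Hypothetical pattern: matches a trailing "/pull/<number>" or "/pull/XXX".
PR_LINK_RE = re.compile(r"/pull/(\d+|XXX)/?$")


def pr_link_matches(pr_link: str, pr_number: int) -> bool:
    """Return True if pr_link targets pr_number or uses the XXX placeholder.

    Hypothetical helper for illustration; it is not part of the project.
    """
    match = PR_LINK_RE.search(pr_link.strip())
    if match is None:
        # Not a recognizable pull-request link at all.
        return False
    target = match.group(1)
    return target == "XXX" or int(target) == pr_number
```

With the values from this PR, the link ending in `/pull/1552` would be rejected when checked against PR number 1575, while `/pull/1575` or `/pull/XXX` would pass.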
Single-node `full-sweep` crashes on `conc-list` configs

Medium Severity

The new `dsv4-fp4-b300-sglang` config uses `conc-list` in single-node `fixed-seq-len` search-space entries, but the `generate_full_sweep()` function's single-node code path unconditionally accesses `conc-start` and `conc-end` without first checking for `conc-list`. Running `full-sweep` over the nvidia master config will crash with a `KeyError`. The `test-config` path (used by the PR's CI and `process_changelog.py`) handles `conc-list` correctly, which is why the PR's own tests pass, but the `full-sweep` command (documented in the README and available via `e2e-tests.yml`) is broken for this config.

Reviewed by Cursor Bugbot for commit deee4cc.
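One way to avoid the `KeyError` is to check for `conc-list` before falling back to `conc-start`/`conc-end`. The sketch below is illustrative only and is not the actual body of `generate_full_sweep()`: `resolve_concurrencies` is a hypothetical helper, and because the real range-expansion rule is not shown in this review, it is passed in as the assumed `expand_range` callable.

```python
from typing import Callable


def resolve_concurrencies(
    entry: dict,
    expand_range: Callable[[int, int], list[int]],
) -> list[int]:
    """Return the concurrency values for one search-space entry.

    Hypothetical helper: prefers an explicit conc-list and only reads
    conc-start / conc-end when both are present.
    """
    if "conc-list" in entry:
        # Explicit list wins; copy it so callers cannot mutate the config.
        return list(entry["conc-list"])
    if "conc-start" in entry and "conc-end" in entry:
        # Assumption: the caller supplies the project's own expansion rule.
        return expand_range(entry["conc-start"], entry["conc-end"])
    raise ValueError("entry needs conc-list or both conc-start and conc-end")
```

For example, `resolve_concurrencies({"conc-list": [512, 768, 1024]}, expand_range=...)` returns the list unchanged without ever touching `conc-start`, and an entry with neither form fails with a descriptive `ValueError` rather than a bare `KeyError`.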