8 changes: 4 additions & 4 deletions .github/configs/nvidia-master.yaml
@@ -2208,7 +2208,7 @@ qwen3.5-fp4-b200-sglang-mtp:
- { tp: 2, ep: 1, conc-start: 4, conc-end: 64, spec-decoding: mtp }

glm5-fp8-b200-sglang:
- image: lmsysorg/sglang:v0.5.12-cu130
+ image: lmsysorg/sglang:nightly-dev-cu13-20260523-c112f762
model: zai-org/GLM-5-FP8
model-prefix: glm5
runner: b200
@@ -2227,7 +2227,7 @@ glm5-fp8-b200-sglang:
- { tp: 8, ep: 1, conc-start: 4, conc-end: 256 }

glm5-fp8-b200-sglang-mtp:
- image: lmsysorg/sglang:v0.5.12-cu130
+ image: lmsysorg/sglang:nightly-dev-cu13-20260523-c112f762
model: zai-org/GLM-5-FP8
model-prefix: glm5
runner: b200
@@ -2307,7 +2307,7 @@ glm5-fp8-b300-sglang-mtp:
- { tp: 8, ep: 1, conc-start: 4, conc-end: 256, spec-decoding: mtp }

glm5-fp4-b200-sglang:
- image: lmsysorg/sglang:v0.5.12-cu130
+ image: lmsysorg/sglang:nightly-dev-cu13-20260523-c112f762
model: nvidia/GLM-5-NVFP4
model-prefix: glm5
runner: b200
@@ -2328,7 +2328,7 @@ glm5-fp4-b200-sglang:
- { tp: 4, ep: 1, conc-start: 4, conc-end: 256 }

glm5-fp4-b200-sglang-mtp:
- image: lmsysorg/sglang:v0.5.12-cu130
+ image: lmsysorg/sglang:nightly-dev-cu13-20260523-c112f762
model: nvidia/GLM-5-NVFP4
model-prefix: glm5
runner: b200
24 changes: 24 additions & 0 deletions perf-changelog.yaml
@@ -3129,3 +3129,27 @@
description:
- "Add --use-chat-template to run_benchmark_serving so prompts are formatted with the Qwen chat template (matching the other Qwen MTP recipes)"
pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1555

+ - config-keys:
+ - glm5-fp4-b200-sglang
+ description:
+ - "Update SGLang image from v0.5.11-cu130 to nightly-dev-cu13-20260523-c112f762"
+ pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1561
+
+ - config-keys:
+ - glm5-fp4-b200-sglang-mtp
+ description:
+ - "Update SGLang image from v0.5.11-cu130 to nightly-dev-cu13-20260523-c112f762"
+ pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1561
+
+ - config-keys:
+ - glm5-fp8-b200-sglang
+ description:
+ - "Update SGLang image from v0.5.11-cu130 to nightly-dev-cu13-20260523-c112f762"
+ pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1561
+
+ - config-keys:
+ - glm5-fp8-b200-sglang-mtp
+ description:
+ - "Update SGLang image from v0.5.11-cu130 to nightly-dev-cu13-20260523-c112f762"
+ pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1561

Check warning on line 3155 in perf-changelog.yaml


Claude / Claude Code Review

Stale perf-changelog entries: wrong baseline version and pr-link

Comment on lines +3133 to +3155

🟡 The four new perf-changelog entries (lines 3133-3155) carry stale info from the abandoned precursor PR #1561: (1) the description says 'from v0.5.11-cu130' but the diff at nvidia-master.yaml lines 2210/2229/2309/2330 shows the prior image was v0.5.12-cu130 (off by one patch version), and (2) the pr-link points to #1561 (the precursor) rather than this PR (#1567), which actually lands the change. Both are documentation-only nits, but should be corrected before merge: update the baseline to v0.5.12-cu130 and the pr-link to #1567 so future readers can trace the actual delta and merge commit.

Extended reasoning...

What the bug is

This PR adds four new entries to perf-changelog.yaml (lines 3133-3155), one for each glm5 b200 sglang recipe whose image is being bumped. Each entry contains two pieces of stale information copied from the abandoned precursor PR #1561:

  1. Wrong baseline version in description. All four entries read "Update SGLang image from v0.5.11-cu130 to nightly-dev-cu13-20260523-c112f762". But the actual previous image — visible in the - lines of the diff at .github/configs/nvidia-master.yaml lines 2210, 2229, 2309, 2330 — was lmsysorg/sglang:v0.5.12-cu130, not v0.5.11. The PR description itself acknowledges this: it says the bump is from lmsysorg/sglang:v0.5.12-cu130, and the Cursor-bot summary embedded in the PR body explicitly calls out that "changelog text references v0.5.11-cu130 as the prior baseline".

  2. Wrong pr-link. All four entries set pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1561. But per the PR description, this PR "Mirrors #1561 (xinli-sw:glm-update) re-based on current main", so #1561 ("Update glm-5 container to use SGLang latest") is the abandoned precursor and #1567 ("Update glm-5 b200 sglang image to nightly-dev-cu13-20260523-c112f762") is the actual PR that will land. The convention elsewhere in perf-changelog.yaml is that pr-link points to the PR that actually introduces the change (e.g. the immediately preceding entries at lines 3107, 3115, 3125, 3131 link to #1555 ("[Klaud Cold] qwen3.5-fp8-mi355x-atom-mtp: enable --use-chat-template"), #1516 ("[NV] update Minimax2.5 fp8 h100 vllm"), etc., matching the merge commits in the recent git log).

Why these are both stale-from-rebase artifacts

Git log confirms commit 8e0f658 (PR #1447) already bumped these four recipes from v0.5.11 to v0.5.12 prior to this PR. So the changelog text "from v0.5.11" was accurate at the time #1561 was first authored, but became stale once #1561 was rebased onto current main (where v0.5.12 was already in place) and resubmitted as #1567. The pr-link similarly carries the original PR number, not the rebase-mirror PR number.

Step-by-step proof

  1. Open the PR diff for .github/configs/nvidia-master.yaml. At line 2210 the removed line is - image: lmsysorg/sglang:v0.5.12-cu130 (same at 2229, 2309, 2330). So the actual prior image is v0.5.12-cu130.
  2. Open the PR diff for perf-changelog.yaml. Lines 3136, 3142, 3148, 3154 all say "Update SGLang image from v0.5.11-cu130 to nightly-dev-cu13-20260523-c112f762". Compare to step 1: v0.5.11 ≠ v0.5.12.
  3. Lines 3137, 3143, 3149, 3155 all set pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1561. The PR being reviewed is #1567 ("Update glm-5 b200 sglang image to nightly-dev-cu13-20260523-c112f762"), which per its own description "Mirrors #1561 ... re-based on current main".
  4. Scanning the immediately preceding changelog entries (lines 3107, 3115, 3125, 3131), pr-links are 1555, 1516, etc., each matching the PR that actually introduced the change (verifiable via git log against commits d4948f9 and 298d8f9).

Impact

Documentation-only. No runtime effect. The cost is purely traceability: anyone reading perf-changelog.yaml in the future to understand the v0.5.12→nightly delta will (a) see the wrong starting version, and (b) follow the pr-link into a closed, abandoned PR rather than the merged commit.

Fix

In the four new entries in perf-changelog.yaml, change:

  • description: v0.5.11-cu130 → v0.5.12-cu130
  • pr-link: /pull/1561 → /pull/1567
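
For illustration, one corrected entry might look like the sketch below once both values are updated. The indentation is an assumption, since the rendered diff does not preserve it; the key names, image tags, and PR URL are taken from the diff and review text. The other three entries (glm5-fp4-b200-sglang-mtp, glm5-fp8-b200-sglang, glm5-fp8-b200-sglang-mtp) would change the same two values.

```yaml
# Sketch of one corrected perf-changelog.yaml entry (indentation assumed).
- config-keys:
    - glm5-fp4-b200-sglang
  description:
    # Baseline corrected from v0.5.11-cu130 to v0.5.12-cu130
    - "Update SGLang image from v0.5.12-cu130 to nightly-dev-cu13-20260523-c112f762"
  # pr-link corrected from /pull/1561 to /pull/1567
  pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1567
```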
