[AI Generated] [Handoff] out of 70+ image updates, 13 stuck Klaud Cold PRs need upstream coordination / scope decisions #1511

# human

handing off to @Oseltamivir to try to debug this; my /loop failed on these and I didn't get the chance to look at them manually

# below is [AI Generated]

# Handoff to @Oseltamivir — 13 stuck Klaud Cold PRs

These 13 PRs all have a real diagnosis and (where applicable) an applied workaround, but they're blocked on upstream fixes, infra outages, or judgment calls outside what `/loop` should keep retrying. Handing them off so they don't keep churning sweep capacity.

PRs grouped by category:

| Category | Count | PRs |
| --- | --- | --- |
| DSV4 — needs custom image | 4 | ~~#1461~~, #1460, #1455, #1450 |
| Upstream bugs with ticket filed | 4 | #1494, #1441 (AMD → sgl#25742); ~~#1451 → sgl#25863~~; #1420 → sgl#25563 |
| MI300 cluster down (firmware upgrades) | 3 | #1403, #1499, #1482 |
| Other | 2 | ~~#1512, #1521~~ done |

And the same set grouped by vendor (NVIDIA vs AMD):

| Vendor | Category | Count | PRs |
| --- | --- | --- | --- |
| **NVIDIA** (H200/B200/B300), 7 total | DSV4: latest stable image didn't work | 4 | #1461 (H200), #1460 (H200), #1455 (B300), #1450 (B200) |
|  | Upstream bugs with ticket filed | 2 | #1451 (B300 → sgl#25863), #1420 (B300 → sgl#25563) |
|  | Other | 1 | #1512 (B300, sgl-deep-gemm pin test for sgl#25551) |
| **AMD** (MI300X/MI355X), 6 total | Upstream bugs with ticket filed | 2 | #1494 (MI355X), #1441 (MI355X); both AMD-acknowledged via sgl#25742 |
|  | MI300 cluster down (firmware upgrades) | 3 | #1403 (MI300X), #1499 (MI300X), #1482 (MI300X) |
|  | Other | 1 | #1521 (MI355X, eval-only flake) |

For each PR below: links to the PR + most recent failing sweep run, what's wrong, what's been tried, and the upstream tickets (if filed).

---

## DSV4: latest stable image didn't work (4 PRs)

**Category summary:** Every DSV4 PR is blocked on the generic upstream image lacking what DSV4-Pro needs:
- **Transformers patch** (`KeyError: 'deepseek_v4'`) for the model_type registration: affects #1455, #1450.
- **Per-GPU VRAM footprint** the generic image can't fit (~125 GB of the 141 GB on each H200): affects #1460.
- **vLLM CUDA-graph profiler over-reservation** on H200 with v0.21.0: affects #1461 (the only one that might still be fixable in-recipe via an env var).

Recommendation: keep DSV4 pinned to its SHA-pinned custom images (`deepseek-v4-blackwell@sha256:...`, `deepseek-v4-b300@sha256:...`, `deepseek-v4-hopper`); close the generic-bump PRs (#1460, #1455, #1450). #1461 is worth one more env-var attempt before closing.

### #1461 — dsv4-fp8-h200-vllm (+mtp) → v0.21.0

- PR: https://github.com/SemiAnalysisAI/InferenceX/pull/1461
- Failing run: https://github.com/SemiAnalysisAI/InferenceX/actions/runs/26009323734
- **Diagnosis:** vLLM v0.21.0 CUDA-graph memory profiler still over-reserves VRAM even after dropping `--gpu-memory-utilization` to `0.90` (already pushed). Sweep finished with `FAILURE=77 SKIPPED=7 SUCCESS=15` — almost entirely failing.
- **Tried (didn't fix):** Lowered `--gpu-memory-utilization` from default → `0.90`. Did not unblock.
- **Proposed next:** Add `export VLLM_MEMORY_PROFILER_ESTIMATE_CUDAGRAPHS=0` before `vllm serve` in `dsv4_fp8_h200{,_mtp}.sh`. Worth one more attempt before escalating to vLLM upstream.
- **Upstream:** None filed yet.
- **Escalation:** If you try the `VLLM_MEMORY_PROFILER_ESTIMATE_CUDAGRAPHS=0` workaround (or any other debug attempt) and it still fails, please ping @ywang96 — DSV4 + vLLM v0.21.0 on H200 is in their wheelhouse.

### #1460 — dsv4-fp8-h200-sglang (+mtp) → v0.5.12-cu130

- PR: https://github.com/SemiAnalysisAI/InferenceX/pull/1460
- Failing run: https://github.com/SemiAnalysisAI/InferenceX/actions/runs/26006219979
- **Diagnosis:** DSV4-Pro FP8 + MTP weights take ~125 GB / 141 GB per H200 on generic `v0.5.12-cu130`. The custom `deepseek-v4-hopper` image uses a different EAGLE / weight layout that fits.
- **Tried (didn't fix):** Image bump alone; no flag toggle would close a ~16 GB/H200 gap.
- **Recommendation:** Keep DSV4 pinned to SHA-pinned custom image; the generic-bump is not viable on H200. Likely **close** this PR.
- **Upstream:** None — this is an sglang-image packaging decision, not a bug.

### #1455 — dsv4-fp4-b300-sglang (+mtp) → v0.5.12-cu130

- PR: https://github.com/SemiAnalysisAI/InferenceX/pull/1455
- Failing run: https://github.com/SemiAnalysisAI/InferenceX/actions/runs/26006192244
- **Diagnosis:** **Not OOM, not a B300 kernel regression.** Server starts cleanly (`"fired up and ready to roll!"`); the bench client crashes in `benchmark_serving.py` calling `AutoTokenizer.from_pretrained("/data/models/dsv4-pro")` with `KeyError: 'deepseek_v4'` — the generic `v0.5.12-cu130` image's transformers doesn't know that `model_type`. The custom `deepseek-v4-b300@sha256:...` image bundles a patched transformers.
- **Tried (didn't fix):** Image bump only. Post-cluster-fix rebase (head `7e3166ec`) also fails (`FAILURE=2 IN_PROGRESS=2 QUEUED=25 SKIPPED=23 SUCCESS=7`) — confirming the diagnosis isn't about cluster availability.
- **Recommendation:** Keep DSV4 pinned to custom `deepseek-v4-b300@sha256:...` image until sglang ships an image that bundles transformers with `deepseek_v4` support. Likely **close**. Same conclusion as #1460 / #1450.
- **Upstream:** Could file against sglang asking them to land deepseek_v4 in the generic image's pinned transformers — but worth checking with Bryan first.
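
Before spending another sweep on a candidate image, the diagnosis above can be checked directly by asking the image's bundled transformers whether it registers the `deepseek_v4` model type. The following is a minimal sketch, not the bench client's code: the helper name `knows_model_type` is hypothetical, and it assumes the config registry (`CONFIG_MAPPING`) is a reasonable proxy for what `AutoTokenizer.from_pretrained` will accept.

```python
from collections.abc import Mapping
from typing import Optional


def knows_model_type(model_type: str, registry: Optional[Mapping] = None) -> bool:
    """Return True if `model_type` is present in the config registry.

    `registry` is injectable so the check can be exercised without
    transformers installed (hypothetical helper, for illustration only).
    """
    if registry is None:
        # Imported lazily: only needed when probing a real image.
        from transformers import CONFIG_MAPPING

        registry = CONFIG_MAPPING
    return model_type in registry


# Stand-in registries that mimic a generic image and a patched image.
generic_image = {"deepseek_v3": object}
patched_image = {"deepseek_v3": object, "deepseek_v4": object}

print(knows_model_type("deepseek_v4", generic_image))
print(knows_model_type("deepseek_v4", patched_image))
```

Inside a real container, calling `knows_model_type("deepseek_v4")` with no second argument probes the installed transformers; a `False` result suggests the bench client would fail the same way as in this PR.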

### #1450 — dsv4-fp4-b200-sglang → v0.5.12-cu130

- PR: https://github.com/SemiAnalysisAI/InferenceX/pull/1450
- Failing run: https://github.com/SemiAnalysisAI/InferenceX/actions/runs/26006157172
- **Diagnosis:** Almost certainly same root cause as #1455 — generic `v0.5.12-cu130` bundles a transformers that doesn't recognize `model_type: "deepseek_v4"`, bench client crashes in `AutoTokenizer.from_pretrained` with `KeyError: 'deepseek_v4'`. The custom `deepseek-v4-blackwell@sha256` image presumably bundles a patched transformers.
- **Tried (didn't fix):** Image bump only.
- **Recommendation:** Keep DSV4 pinned to `lmsysorg/sglang:deepseek-v4-blackwell@sha256:...`. Same handling as #1460 / #1455 — likely **close**.
- **Upstream:** Same potential ask as #1455.

---

## Upstream bugs with ticket filed (4 PRs)

**Category summary:** Each of these has a known upstream issue blocking the recipe. AMD (sgl#25742) acknowledged the GLM-5.1-MXFP4 GSM8K regression. The two B300/sglang-v0.5.12 ones (sgl#25563, sgl#25863) are filed and awaiting triage. Don't keep retrying these — wait for upstream.

### #1494 — Add glm5.1-fp4-mi355x-sglang-mtp recipe — sgl#25742 (AMD)

- PR: https://github.com/SemiAnalysisAI/InferenceX/pull/1494
- Failing run: https://github.com/SemiAnalysisAI/InferenceX/actions/runs/26018162860
- **Diagnosis:** **Quality regression, not a crash.** 1 eval-only job failed: `glm5.1 fp4 mi355x sglang tp=2 spec-mtp conc-256 eval-only`. Server warmed up, lm-eval ran gsm8k to completion, but accuracy was `exact_match = 0.1774 / 0.1782` against the `0.85` threshold. EAGLE+MTP draft on GLM-5.1 MXFP4 is producing degenerate output on math reasoning — likely the draft model isn't aligned for chain-of-thought, or the new recipe's speculative knobs need tuning (`--speculative-num-steps=3`, `--speculative-num-draft-tokens=4`). Perf-bench jobs were still in flight at handoff time.
- **Tried (didn't fix):** None — fix needs a judgment call on which option below to take.
- **Options:**
  - (a) Drop the eval-only entry from the recipe (let perf bench validate; skip the gsm8k accuracy gate for the MTP variant).
  - (b) Tune `--speculative-num-steps` / `--speculative-eagle-topk` down.
  - (c) Lower the gsm8k threshold for this recipe in `utils/evals/thresholds.json`.
  - (d) Wait for the perf-bench jobs to finish — if those pass, merge with eval gate removed.
- **Recommendation:** (a) or (d). Tuning EAGLE knobs (b) without a real perf-quality study is just guessing, and dropping the threshold (c) silently hides the regression.
- **Upstream:** https://github.com/sgl-project/sglang/issues/25742 (filed) — asking AMD/sglang whether the ~1.8× accuracy regression vs OFF is a known limitation of the bundled EAGLE head.

### #1441 — Update glm5.1-fp4-mi355x-sglang SGLang ROCm image to v0.5.12-rocm720-mi35x-20260517 — sgl#25742 (AMD)

- PR: https://github.com/SemiAnalysisAI/InferenceX/pull/1441
- Failing run: https://github.com/SemiAnalysisAI/InferenceX/actions/runs/26046393497
- **Diagnosis:** **Same GLM-5.1-MXFP4 gsm8k quality issue as #1494, but on the OFF variant (no MTP).** 2 eval-only jobs failed; server warmed up, lm-eval ran gsm8k to completion (~46 min), `exact_match = 0.3177 (< 0.85 threshold)`. Better than #1494's 0.18 (no draft model degrading output) but still ~2.7x below the gate. **The GLM-5.1-MXFP4 model itself doesn't pass GSM8K at fp4 on mi355x, so this is not an image-bump regression.** 27 perf-bench jobs succeeded; only eval-only is failing.
- **Tried (didn't fix):** Nothing pushed — needs a judgment call on which fix option below.
- **Options (overlapping with #1494):**
  - (a) Drop eval-only entries for both glm5.1-fp4-mi355x recipes (off + mtp).
  - (b) Lower the gsm8k threshold in `utils/evals/thresholds.json` for these recipes (e.g. to 0.30 for off, 0.15 for mtp).
  - (c) Merge as-is if 27/29 jobs passing is acceptable (all 27 perf-bench jobs pass; only the 2 eval-only jobs fail).
- **Recommendation:** (a) — both glm5.1-fp4-mi355x recipes are gating on a metric the model can't meet at this precision; gate is providing zero signal beyond "model is fp4-quantized". Pair with the #1494 decision since they're the same root cause.
- **Upstream:** https://github.com/sgl-project/sglang/issues/25742 (filed) — asking AMD/sglang for an expected GSM8K reference for GLM-5.1-MXFP4 on this image.
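
The size of the gap for both GLM-5.1 recipes is easier to judge with the arithmetic written out. This is a small sketch using only the scores and the 0.85 gate quoted in this section; the names `SCORES` and `shortfall` are illustrative and not part of `utils/evals`.

```python
GSM8K_THRESHOLD = 0.85  # gate quoted in this handoff

# exact_match values quoted above; for #1494 the first of the two reported numbers.
SCORES = {"off (#1441)": 0.3177, "mtp (#1494)": 0.1774}


def shortfall(score: float, threshold: float = GSM8K_THRESHOLD) -> float:
    """Return threshold / score, i.e. how many times below the gate a score sits."""
    return threshold / score


for variant, score in SCORES.items():
    verdict = "pass" if score >= GSM8K_THRESHOLD else "fail"
    print(f"{variant}: {score:.4f} -> {verdict}, {shortfall(score):.2f}x below the gate")

# How much accuracy the MTP variant loses relative to the OFF variant.
off_vs_mtp = SCORES["off (#1441)"] / SCORES["mtp (#1494)"]
print(f"off / mtp = {off_vs_mtp:.2f}x")
```

With these inputs neither variant comes within a factor of two of the gate, which is why lowering the threshold far enough to pass would leave the gate with little meaning.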

### #1451 — qwen3.5-fp8-b300-sglang (+mtp) → v0.5.12-cu130 — sgl#25863

- PR: https://github.com/SemiAnalysisAI/InferenceX/pull/1451
- Failing run: https://github.com/SemiAnalysisAI/InferenceX/actions/runs/26144042784/job/76895376231
- **Diagnosis:** Post-cluster-fix rebase clears the vision-encoder cute crash via `--mm-attention-backend triton_attn` workaround, but now exposes a **silent GSM8K quality regression**: `exact_match=0.0000 (strict-match) / 0.0015 (flexible-extract)` against a `0.85` threshold on the `tp=4 8k1k spec-none conc-256 eval-only` matrix entry. Server starts cleanly; requests succeed; model just isn't producing GSM8K-formatted answers. Likely interaction between `--quantization fp8`, `--moe-runner-backend flashinfer_trtllm`, `--attention-backend trtllm_mha`, and chat-template handling in v0.5.12-cu130.
- **Tried (didn't fix):** Rebased onto current main (head `e1d3a181`). `--mm-attention-backend triton_attn` workaround in place (for the unrelated sgl#25564 cute crash); doesn't fix the quality regression.
- **Upstream:** https://github.com/sgl-project/sglang/issues/25863 (filed today) — including server-args, eval command, and per-sample artifact pointer.

### #1420 — glm5-fp4-b300-sglang (+mtp) → v0.5.12-cu130 — sgl#25563

- PR: https://github.com/SemiAnalysisAI/InferenceX/pull/1420
- Failing run: https://github.com/SemiAnalysisAI/InferenceX/actions/runs/26144061626/job/76895442359
- **Diagnosis:** Upstream sglang v0.5.12 `trtllm-batched-gemm` bug — EAGLE draft CUDA-graph capture crashes at `bs=128` (`numBatches=256`, `GemmMNK 128x1024x6144`, kernel `...sm100f`) on B300 for GLM-5-NVFP4. Per @trevor-m on sgl#25563, likely a flashinfer regression (`flashinfer_python` 0.6.8.post1 → 0.6.11.post1 bump between v0.5.11 and v0.5.12).
- **Tried (currently running):** Pushed `cfaf3bdd` pinning `flashinfer_python==0.6.8.post1` + `flashinfer_cubin==0.6.8.post1` in `glm5_fp4_b300{,_mtp}.sh`; rebased onto current main (head `006a3908`). Post-rebase sweep shows `FAILURE=1 IN_PROGRESS=8 QUEUED=1 SKIPPED=6 SUCCESS=33` — most jobs pass with the pin, but one 8k1k MTP job still fails. Worth checking whether the remaining FAILURE is the same trtllm-batched-gemm site or something new.
- **Upstream:** https://github.com/sgl-project/sglang/issues/25563 (active discussion with @trevor-m).

---

## MI300 cluster down — waiting for firmware upgrades (3 PRs)

**Category summary:** MI300 cluster is in a firmware-upgrade window; sweep retries that hit `mi300x-amds_*` nodes get cancelled or can't allocate. No code change needed — these PRs just need a rerun once the upgrade window ends. The high cancellation counts (e.g. `CANCELLED=10–13`) are infra, not recipe regressions.
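
The "cancellations are infra" call can be made mechanical by parsing the sweep breakdown strings quoted in the entries of this section. The sketch below is hypothetical tooling rather than anything in `/loop`: the function names are made up, and the rule it applies (cancellations with zero hard failures means infra-only) is an assumption about how to triage, not an established policy.

```python
def parse_breakdown(text: str) -> dict[str, int]:
    """Parse 'CANCELLED=10 FAILURE=3 ...' into {'CANCELLED': 10, 'FAILURE': 3, ...}."""
    counts = {}
    for token in text.split():
        status, _, value = token.partition("=")
        counts[status] = int(value)
    return counts


def infra_only(counts: dict[str, int]) -> bool:
    """True when jobs were cancelled but none hard-failed."""
    return counts.get("CANCELLED", 0) > 0 and counts.get("FAILURE", 0) == 0


# Breakdown strings copied from the #1499 and #1482 entries.
pr_1499 = parse_breakdown("CANCELLED=10 FAILURE=3 SKIPPED=6 SUCCESS=12")
pr_1482 = parse_breakdown("CANCELLED=13 SKIPPED=6 SUCCESS=12")

print("#1482 infra-only:", infra_only(pr_1482))
print("#1499 infra-only:", infra_only(pr_1499))
```

A breakdown that is not infra-only, such as #1499 with its 3 FAILUREs, does not contradict the outage diagnosis; it only marks those jobs as worth a second look after the rerun.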

### #1499 — Add dsr1-fp8-mi300x-sglang-mtp recipe

- PR: https://github.com/SemiAnalysisAI/InferenceX/pull/1499
- Failing run: https://github.com/SemiAnalysisAI/InferenceX/actions/runs/26048211958
- **Diagnosis:** Sweep breakdown `CANCELLED=10 FAILURE=3 SKIPPED=6 SUCCESS=12` on the mi300x pool — mostly cancellations consistent with the firmware-upgrade outage. The successful 12 jobs prove the recipe itself works on the available nodes.
- **Tried (didn't fix):** Nothing — reruns will keep getting cancelled until the cluster is back.
- **Proposed fix:** Wait for the firmware-upgrade window to end, then `gh run rerun --failed` on the latest sweep run.
- **Upstream:** None — infra schedule.

### #1482 — Add qwen3.5-fp8-mi300x-sglang-mtp recipe

- PR: https://github.com/SemiAnalysisAI/InferenceX/pull/1482
- Failing run: https://github.com/SemiAnalysisAI/InferenceX/actions/runs/26016446435
- **Diagnosis:** Sweep breakdown `CANCELLED=13 SKIPPED=6 SUCCESS=12`: all 13 unsuccessful jobs are cancellations on mi300x nodes, with zero hard FAILUREs. Same firmware-upgrade outage as #1499.
- **Tried (didn't fix):** Nothing — same infra root cause.
- **Proposed fix:** Wait for firmware-upgrade window to end, then `gh run rerun --failed`.
- **Upstream:** None — infra schedule.

### #1403 — Update gptoss-fp4-mi300x-vllm vLLM ROCm image to v0.21.0

- PR: https://github.com/SemiAnalysisAI/InferenceX/pull/1403
- Failing run: https://github.com/SemiAnalysisAI/InferenceX/actions/runs/26008643806
- **Diagnosis:** **Originally diagnosed as a transient SLURM controller flake; now re-categorized as MI300 cluster downtime for firmware upgrades.** Single matrix job (`single-node 8k1k spec-none conc-X`) timed out after ~5h waiting for an allocation on `mi300x-amds_01`. The salloc log shows `_accept_msg_connection[167.94.146.58:63632]: Connection reset by peer; Job submit/allocate failed`. The other 42 successes prove the image bump itself is fine.
- **Tried (didn't fix):** Nothing — reruns will keep hitting the same infra gap until the upgrade completes.
- **Proposed fix:** Wait for the firmware-upgrade window to end, then `gh run rerun 26008643806 --failed`. Once the cluster can allocate again, the PR should go green.
- **Upstream:** None — infra schedule, not a bug.

---

## Other (2 PRs)

### #1512 — Test `sgl-deep-gemm==0.0.1` pin for sgl#25551 (glm5-fp8-b300 DeepGemm regression)

- PR: https://github.com/SemiAnalysisAI/InferenceX/pull/1512
- Failing run: https://github.com/SemiAnalysisAI/InferenceX/actions/runs/26144077664/job/76895488937
- **What this PR is:** A debug-only test of @trevor-m's suggestion in [sgl-project/sglang#25551 (comment)](https://github.com/sgl-project/sglang/issues/25551#issuecomment-4481466979) — pin `sgl-deep-gemm==0.0.1` inside the v0.5.12 container (re-enable JIT DeepGemm) to check whether the deep-gemm `0.0.1 → 0.1.0` upgrade is what triggers the B300 `CUDA_ERROR_ILLEGAL_ADDRESS` TMA-descriptor regression. **Not meant to merge.**
- **Diagnosis:** **First attempt was invalid — the pin never applied.** Both `pip install --no-deps` lines in `glm5_fp8_b300.sh` got blocked by Debian PEP 668 (`error: externally-managed-environment`) inside `lmsysorg/sglang:v0.5.12-cu130`. Pushed `f24746e5` adding `--break-system-packages` so the pin actually takes effect; awaiting fresh sweep result.
- **Tried (didn't fix):** Initial 0.0.1 pin via `pip install --no-deps` — blocked by PEP 668. Now fixed with `--break-system-packages`.
- **Upstream:** https://github.com/sgl-project/sglang/issues/25551 — comment posted explaining the invalid first run; will follow up once the fixed sweep produces a real result.
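
Because the first run was invalid only because the pin silently failed to install, a guard that verifies the installed version before the server launches would catch a repeat. This is a minimal sketch under assumptions: the helper `pin_applied` is hypothetical, and it presumes the package is visible to Python's `importlib.metadata` under the distribution name `sgl-deep-gemm` used in the pin.

```python
from importlib import metadata
from typing import Callable


def pin_applied(
    dist: str,
    expected: str,
    get_version: Callable[[str], str] = metadata.version,
) -> bool:
    """Return True only if `dist` is installed at exactly `expected`.

    `get_version` is injectable so the check can be tried without the
    package present (hypothetical helper, for illustration only).
    """
    try:
        return get_version(dist) == expected
    except metadata.PackageNotFoundError:
        # Not installed at all counts as "pin did not apply".
        return False


# Simulated outcomes: the pin took effect vs. the image default stayed in place.
print(pin_applied("sgl-deep-gemm", "0.0.1", lambda _dist: "0.0.1"))
print(pin_applied("sgl-deep-gemm", "0.0.1", lambda _dist: "0.1.0"))
```

In the recipe, calling `pin_applied("sgl-deep-gemm", "0.0.1")` after the `pip install` step and aborting on `False` would make a blocked install fail loudly instead of producing a misleading sweep result.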

### #1521 — Add dsr1-fp8-mi355x-sglang-mtp single-node MTP recipe

- PR: https://github.com/SemiAnalysisAI/InferenceX/pull/1521
- Failing run: https://github.com/SemiAnalysisAI/InferenceX/actions/runs/26134317173/job/76893483331
- **Diagnosis:** Sweep breakdown `CANCELLED=1 FAILURE=1 SKIPPED=24 SUCCESS=23` — a single eval-only job died early (`srun: error: mia1-p01-g31: task 0: Exited with exit code 1`, no `results*.json` produced). Server fully started (cuda-graph capture completed, mem ~88 GB free); failure is in the eval shell stage, not the model. Likely a flake or a missing eval-tool dep on that one node — needs investigation before judging the recipe.
- **Tried (didn't fix):** Nothing — needs a closer look at the failing eval job's pre-run shell output.
- **Proposed next:** Rerun the failed job (`gh run rerun 26134317173 --failed`) and if it reproduces, dig into the eval shell stage for that node.
- **Upstream:** None — likely flake/env issue, not a bug.

---

The `/loop` skill keeps refreshing the dashboard and applying surface-level workarounds, but none of these are productive to keep retrying without either (a) an upstream fix landing, (b) the MI300 cluster coming back, or (c) a human deciding scope (close vs keep open, change strategy).

Each affected PR's title has been prefixed with `[Handoff to @Oseltamivir Claude /loop]` so they're easy to find in the PR list and the dashboard won't keep re-diagnosing them.


