Bug: Qwen3.6-35B-A3B-4bit tg TPS=0 with pp1024/tg128 benchmark (MoE short-context issue) #979

## Bug Description

When running the benchmark on `Qwen3.6-35B-A3B-4bit`, the token generation (tg) phase
is skipped entirely for the `pp1024/tg128` test: it reports `tg TPS=0.0` and `TPOT=0.00ms`.
The `pp4096/tg128` test works correctly.

## Environment

- Hardware: MacBook Air M5 32GB
- Model: `Qwen3.6-35B-A3B-4bit`
- Config:
  - ctx_window: 65536
  - enable_thinking: false
  - TurboQuant KV Cache 4bit
  - SpecPrefill: Qwen3.5-2B-OptiQ-4bit
  - DFlash: Qwen3.5-9B-DFlash

## Benchmark Results

**Run 1:**

| Test | TTFT (ms) | TPOT (ms) | pp TPS (tok/s) | tg TPS (tok/s) | E2E (s) | Note |
|---|---|---|---|---|---|---|
| pp1024/tg128 | 6905.7 | 0.00 | 148.3 | 0.0 | 6.906 | tg skipped |
| pp4096/tg128 | 24207.7 | 38.39 | 169.2 | 26.3 | 29.083 | normal |

**Run 2 (reproduced):**

| Test | TTFT (ms) | TPOT (ms) | pp TPS (tok/s) | tg TPS (tok/s) | E2E (s) | Note |
|---|---|---|---|---|---|---|
| pp1024/tg128 | 8439.9 | 0.00 | 121.3 | 0.0 | 8.440 | tg skipped |
| pp4096/tg128 | 23722.6 | 37.00 | 172.7 | 27.2 | 28.422 | normal |

## Analysis

- In the `pp1024` case, `E2E ≈ TTFT` (6.906 s vs. 6905.7 ms in Run 1), which indicates that the tg phase never executes.
- Replacing the SpecPrefill draft model has no effect, so this is not a draft-model issue.
- `pp4096` works normally, which suggests a short-context boundary condition
  specific to this MoE model (Qwen3.6-35B-A3B).
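The `E2E ≈ TTFT` observation can be checked numerically. The sketch below assumes that tg TPS is computed as the requested token count divided by `E2E - TTFT`; that formula is an assumption rather than something confirmed from the benchmark source, although it does reproduce the reported `pp4096` figures. The function names and the 1 ms threshold are hypothetical.

```python
# Sanity check on the reported numbers.
# ASSUMPTION: tg TPS = generated tokens / (E2E - TTFT). This formula is
# inferred from the reported figures, not taken from the benchmark code.

TG_TOKENS = 128  # from the "tg128" part of the test name


def decode_seconds(ttft_ms: float, e2e_s: float) -> float:
    """Time left for token generation after the first token."""
    return e2e_s - ttft_ms / 1000.0


def implied_tg_tps(ttft_ms: float, e2e_s: float, tokens: int = TG_TOKENS) -> float:
    """Tokens per second implied by TTFT and E2E; 0.0 if no decode time exists."""
    dt = decode_seconds(ttft_ms, e2e_s)
    # Hypothetical threshold: under 1 ms of decode time counts as "tg skipped".
    if dt < 1e-3:
        return 0.0
    return tokens / dt


# (TTFT in ms, E2E in s) copied from the benchmark results above.
reported = {
    "run1 pp1024/tg128": (6905.7, 6.906),
    "run1 pp4096/tg128": (24207.7, 29.083),
    "run2 pp1024/tg128": (8439.9, 8.440),
    "run2 pp4096/tg128": (23722.6, 28.422),
}

for name, (ttft_ms, e2e_s) in reported.items():
    print(f"{name}: decode={decode_seconds(ttft_ms, e2e_s):.4f}s "
          f"implied tg TPS={implied_tg_tps(ttft_ms, e2e_s):.1f}")
```

Under that assumption, both `pp1024` runs leave well under a millisecond between first token and end of request, while the `pp4096` runs leave several seconds for decoding.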

## Expected Behavior

`pp1024/tg128` should generate 128 tokens after prefill, same as `pp4096/tg128`.
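A regression check for this expectation could look like the sketch below. It assumes test labels always follow the `pp<N>/tg<M>` pattern seen in the results; `parse_test_name` and `tg_was_skipped` are hypothetical helpers, not part of the benchmark tool.

```python
import re


def parse_test_name(name: str) -> tuple[int, int]:
    """Split a label such as 'pp1024/tg128' into (prompt tokens, generated tokens)."""
    match = re.fullmatch(r"pp(\d+)/tg(\d+)", name)
    if match is None:
        raise ValueError(f"unexpected test name: {name!r}")
    return int(match.group(1)), int(match.group(2))


def tg_was_skipped(name: str, tg_tps: float) -> bool:
    """True when generation was requested but the reported tg TPS is zero."""
    _, tg_tokens = parse_test_name(name)
    return tg_tokens > 0 and tg_tps <= 0.0


# Rows from Run 1: (test name, reported tg TPS).
for name, tg_tps in [("pp1024/tg128", 0.0), ("pp4096/tg128", 26.3)]:
    status = "tg skipped" if tg_was_skipped(name, tg_tps) else "ok"
    print(f"{name}: {status}")
```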
