perf(laguna): CUDA-graph replay decode — 113→143 tok/s all-GPU, 101→129 at 60% residency #358

Merged
davide221 merged 6 commits into main from feat/laguna-cudagraph-replay
Jun 10, 2026

ci: run self-hosted GPU jobs without waiting for the hosted CPU build

ece88de