ci(cli): run CLI unit tests in `ut-runtime-1gpu` by zhyncs · Pull Request #129 · lightseekorg/tokenspeed

zhyncs · 2026-05-13T19:10:40Z

Summary

Fold `python3 -m pytest test/cli -v` into the existing `ut-runtime-1gpu` task so the CLI orchestrator surface (argv splitter, banner, log prefix, proc helpers, dispatch) is exercised on every per-commit run alongside the runtime suite, with no new workflow and no duplicate install path.
Why piggyback on a GPU runner instead of `ubuntu-latest`: `_engine_recognized_flags()` lazy-imports `ServerArgs`, which transitively pulls in `triton`, `flashinfer-python`, and `tokenspeed_kernel`. Those need a CUDA build at install time, so a CPU-only runner can't even finish `install_deps.sh`. The existing 1-GPU runner already has the full stack staged.
Fix `test_orchestrator_default_timeouts` along the way. The orchestrator default was bumped from 600s to 1800s in #105 (perf: optimize flashinfer sampling backend), but the assertion was left at 600; update it to 1800 so the suite is green when the task runs.
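
For readers unfamiliar with the lazy-import point above, here is a minimal sketch of the pattern. It is not the repository's `_engine_recognized_flags()`: the helper name `recognized_flags`, the stand-in module `fake_server_args`, its two fields, and the idea that flags are derived from dataclass fields are all assumptions made so the example runs anywhere.

```python
import dataclasses
import importlib
import sys
import types


def recognized_flags(module_name: str, class_name: str = "ServerArgs") -> set[str]:
    """Return CLI flag names derived from a dataclass, importing its module lazily."""
    # The import happens on the first call, not when the CLI module itself is
    # imported, so the heavy dependency chain is only required at this point.
    module = importlib.import_module(module_name)
    cls = getattr(module, class_name)
    return {"--" + f.name.replace("_", "-") for f in dataclasses.fields(cls)}


# Hypothetical stand-in for the real args class; the fields are illustrative only.
@dataclasses.dataclass
class ServerArgs:
    model_path: str = ""
    tp_size: int = 1


# Register the stand-in by hand so import_module() can resolve it.
_fake = types.ModuleType("fake_server_args")
_fake.ServerArgs = ServerArgs
sys.modules["fake_server_args"] = _fake

print(sorted(recognized_flags("fake_server_args")))

# A module that is not installed fails only when the helper is called, which is
# the failure a runner without the CUDA stack would hit in the CLI tests.
try:
    recognized_flags("module_that_is_not_installed_anywhere")
except ModuleNotFoundError as exc:
    print("import failed lazily:", exc.name)
```

The takeaway is that a test which calls the helper needs the full dependency stack even though importing the CLI package does not.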

Test plan

`pre-commit run --all-files`
`pytest test/cli -v` locally: 57 passed, 1 skipped (the SMG integration test calls `importorskip` on `smg` / `smg_grpc_proto`, which are not installed in the dev env).
`ut-runtime-1gpu` / `b200-1gpu` runs the CLI block via the per-commit trigger on this PR; merge is gated on it being green.
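
To predict which optional-dependency tests will be skipped in a given environment, something like the following stdlib-only check can help. It is a rough sketch, not part of the PR: `missing_optional` is a hypothetical helper, and it only mirrors the "module not importable" condition that `pytest.importorskip` skips on.

```python
import importlib.util


def missing_optional(modules):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


# Module names taken from the test plan; the result depends on the environment.
print(missing_optional(["smg", "smg_grpc_proto"]))
```

An empty list suggests the SMG integration test would run rather than skip.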

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: d4e7b89fef

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-05-13T19:14:51Z

+            torch==2.11.0 torchvision
+          # tokenspeed-kernel falls back to a pure-Python stub when CUDA is
+          # absent (see tokenspeed-kernel/python/setup.py).
+          pip install -e tokenspeed-kernel/python/


Avoid installing GPU kernel package in CPU-only CLI job

This workflow installs `tokenspeed-kernel` on `ubuntu-latest`, but the package setup currently computes dependencies via `tokenspeed-kernel/python/setup.py::_selected_backend()` and raises when neither CUDA nor ROCm is detected (`RuntimeError` at setup time). On a standard hosted Ubuntu runner, that makes `pip install -e tokenspeed-kernel/python/` fail before pytest runs, so the new CLI workflow will go red for reasons unrelated to CLI changes. For this CPU-only suite, skip this install or force a backend explicitly.
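
One way "force a backend explicitly" could look is sketched below. This is not the code in `setup.py`: the environment variable `TOKENSPEED_KERNEL_BACKEND`, the `"none"` value, and detection via `nvcc` / `hipcc` on `PATH` are all assumptions for illustration.

```python
import os
import shutil

ALLOWED = {"cuda", "rocm", "none"}


def select_backend(env=None, which=shutil.which):
    """Pick a build backend, honoring an explicit override before auto-detection."""
    env = os.environ if env is None else env
    forced = env.get("TOKENSPEED_KERNEL_BACKEND")  # hypothetical variable name
    if forced:
        if forced not in ALLOWED:
            raise ValueError(f"unknown backend {forced!r}; expected one of {sorted(ALLOWED)}")
        return forced
    # Assumed detection strategy: look for the vendor compiler on PATH.
    if which("nvcc"):
        return "cuda"
    if which("hipcc"):
        return "rocm"
    raise RuntimeError(
        "neither CUDA nor ROCm detected; set TOKENSPEED_KERNEL_BACKEND to override"
    )


# Simulate a CPU-only runner: no compilers on PATH, backend forced to "none".
print(select_backend(env={"TOKENSPEED_KERNEL_BACKEND": "none"}, which=lambda _: None))
```

With an override like this, a CPU-only job could opt out of the GPU build while the default behavior (raise when nothing is detected) stays unchanged.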

chatgpt-codex-connector · 2026-05-13T19:14:51Z

+    branches: [main]
+    paths:
+      - "python/tokenspeed/cli/**"
+      - "python/tokenspeed/runtime/utils/server_args.py"


Expand CLI workflow path filters to cover serve_smg deps

The path filters only include `python/tokenspeed/runtime/utils/server_args.py` from runtime utils, but `tokenspeed.cli.serve_smg` also imports `tokenspeed.runtime.utils.network` and `tokenspeed.runtime.utils.process` (see `python/tokenspeed/cli/serve_smg.py`). A PR that breaks those modules can still break CLI behavior/tests while this workflow never triggers, which leaves a gap in the intended CLI regression coverage.
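
A gap like this can be found mechanically by scanning a CLI module's imports. The sketch below is an assumption-laden illustration: the sample source and the imported names (`get_free_port`, `kill_process_tree`) are hypothetical, and the module-to-path mapping assumes each module is a single `.py` file under `python/`.

```python
import ast


def runtime_imports(source: str, prefix: str = "tokenspeed.runtime") -> set[str]:
    """Collect absolute imports of modules under `prefix` from Python source."""
    def matches(name: str) -> bool:
        return name == prefix or name.startswith(prefix + ".")

    modules = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules.update(a.name for a in node.names if matches(a.name))
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            # Note: `from pkg import submodule` is recorded as `pkg`, not the submodule.
            if matches(node.module):
                modules.add(node.module)
    return modules


def to_path_filters(modules, root: str = "python") -> list[str]:
    """Map dotted module names to candidate workflow path filters."""
    return sorted(f"{root}/{m.replace('.', '/')}.py" for m in modules)


# Hypothetical excerpt standing in for python/tokenspeed/cli/serve_smg.py.
SAMPLE = """
import os
import tokenspeed.runtime.utils.server_args
from tokenspeed.runtime.utils.network import get_free_port
from tokenspeed.runtime.utils.process import kill_process_tree
"""

for path in to_path_filters(runtime_imports(SAMPLE)):
    print(path)
```

Running this over every file in `python/tokenspeed/cli/` would give a candidate list to compare against the workflow's `paths:` block; transitive imports would still need a second pass.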

Fold `pytest test/cli` into the existing `ut-runtime-1gpu` task so the CLI orchestrator surface (argv splitter, banner, log prefix, proc helpers, dispatch) is exercised on every per-commit run alongside the runtime suite.

CLI tests transitively import `ServerArgs`, which pulls `triton`, `flashinfer-python`, and `tokenspeed_kernel`; piggybacking on the GPU runner that already installs those deps avoids spinning up a separate workflow with a duplicate install path.

The orchestrator timeout default was bumped from 600s to 1800s in #105 without updating `test_orchestrator_default_timeouts`; refresh the expected value so the suite is green before turning the task on.

Signed-off-by: zhyncs <46627482+zhyncs@users.noreply.github.com>

…ightseekorg#131)

Squash-rebase of ``codex/ds4-sm12x-poc`` onto ``upstream/main`` at ``dd9866f`` (Refine third-party attribution notices, lightseekorg#131). Picks up nine upstream commits:

* ``dd9866f`` Refine third-party attribution notices (lightseekorg#131)
* ``b6c4617`` feat(cli): disable smg circuit breaker and retries by default (lightseekorg#130)
* ``f55fd2a`` feat(cli): accept positional model arg in ``ts serve`` (lightseekorg#128)
* ``6333e23`` ci(cli): run CLI unit tests in ``ut-runtime-1gpu`` (lightseekorg#129)
* ``db7cae6`` feat(cli): print TokenSpeed banner on ``ts serve`` startup (lightseekorg#127)
* ``361eb09`` perf(K2.5): Optimize lm_head (lightseekorg#126)
* ``c2299fd`` perf: optimize flashinfer sampling backend (lightseekorg#105)
* ``962b83a`` perf(K2.5): enable AR-Norm fusion and fused FP8 decode for MLA Eagle3 (lightseekorg#124)
* ``4da7a1c`` fix to use max_num_pages for spec-decode topk page_table buffers (lightseekorg#125)

Fork delta replayed on top of the new base: 82 files changed, +22833 / -373.

Conflict resolution:

* ``python/tokenspeed/runtime/sampling/backends/flashinfer_full.py`` imports: took upstream's wider import block from lightseekorg#105 (added ``top_k_top_p_renorm_torch`` and ``write_output_top_logprobs``); references on lines 333 and 471 require them.

Pre-rebase state preserved at branch ``codex/ds4-sm12x-poc-prerebase-20260514`` for safety; previous round's backup ``codex/ds4-sm12x-poc-prerebase`` still tracks remote.

Fork-specific work carried forward in this squash:

* SM12x sparse-MLA + DSv4-Flash output projection + MXFP4 MoE kernels (``tokenspeed-kernel/python/tokenspeed_kernel/thirdparty/cuda/``).
* DSv4-Flash runtime model + attention backend + per-kind K-split sparse MLA + indexer ds4-decode shortcut.
* V2 Stage 1 attention aux-stream (post-projection overlap on SM12x).
* Bench tool ``await_with_per_request_timeout`` + ``sock_read=120`` / ``sock_connect=30`` + ``test_bench_timeout.py``.
* All experiment archives + failed-attempts log in ``docs/notes/``.

Test plan:

* AST-parse sanity on the 63 staged Python files: clean.
* Pre-commit + workstation rebuild + sanity test sweep to follow on a separate turn before pushing.

Signed-off-by: jasl <jasl9187@hotmail.com>

chatgpt-codex-connector Bot reviewed May 13, 2026

zhyncs force-pushed the zhyncs/cli-ci-test branch from d4e7b89 to cad92df Compare May 13, 2026 19:16

zhyncs changed the title ~~ci(cli): run CLI unit tests on ubuntu-latest~~ ci(cli): register CLI unit tests under ut-cli on b200-1gpu May 13, 2026

zhyncs force-pushed the zhyncs/cli-ci-test branch from cad92df to e82e063 Compare May 13, 2026 19:20

zhyncs changed the title ~~ci(cli): register CLI unit tests under ut-cli on b200-1gpu~~ ci(cli): run CLI unit tests in ut-runtime-1gpu May 13, 2026

lightseek-bot approved these changes May 13, 2026

lightseek-bot merged commit 6333e23 into main May 13, 2026
2 of 3 checks passed

lightseek-bot deleted the zhyncs/cli-ci-test branch May 13, 2026 19:21
