
[codex] Add inference telemetry release gates#3

Merged
WaffleBits merged 3 commits into main from codex/runtime-telemetry-gates-20260624 on Jun 26, 2026

WaffleBits commented on Jun 24, 2026

Owner

Summary

  • add mirrored inference telemetry for model version, queue depth, KV memory pressure, time to first token (TTFT), decode-token latency, and token-trace fingerprints
  • extend release reports with aggregate and segmented telemetry fields
  • hold candidates for TTFT, decode-token p95, or memory-pressure regressions while preserving rollback for correctness, numeric drift, and error-rate failures
  • add replay pressure summaries for max queued requests, max active requests, peak KV pressure, queued-pressure ticks, active-capacity ticks, and pressure ratios
  • add a checked workload-pressure fixture/artifact that completes eight mixed-priority requests in 27 ticks and peaks at 86.666667% KV pressure
  • update docs, tests, fixtures, and checked artifacts
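
For readers skimming the summary, here is a rough sketch of the two ideas above: the hold/rollback split and the replay pressure summary. It is not the code in this PR. Every type, field, and function name (`GateDecision`, `Regressions`, `decide`, `Tick`, `PressureSummary`, `summarize`) is hypothetical. The sketch also makes three assumptions: rollback-class failures take precedence over hold-class ones, a "queued-pressure tick" is any tick with at least one queued request, and an "active-capacity tick" is any tick where every active slot is in use.

```rust
/// Outcome for a release candidate (hypothetical names, not the PR's types).
#[derive(Debug, PartialEq)]
enum GateDecision {
    Promote,
    Hold,
    Rollback,
}

/// Regression flags for one candidate; field names are illustrative.
#[derive(Default)]
struct Regressions {
    correctness: bool,
    numeric_drift: bool,
    error_rate: bool,
    ttft: bool,
    decode_token_p95: bool,
    memory_pressure: bool,
}

/// Assumption: rollback-class failures win over hold-class regressions.
fn decide(r: &Regressions) -> GateDecision {
    if r.correctness || r.numeric_drift || r.error_rate {
        GateDecision::Rollback
    } else if r.ttft || r.decode_token_p95 || r.memory_pressure {
        GateDecision::Hold
    } else {
        GateDecision::Promote
    }
}

/// One replay tick (hypothetical shape).
struct Tick {
    queued: usize,
    active: usize,
    kv_blocks_used: u32,
}

#[derive(Debug, PartialEq)]
struct PressureSummary {
    max_queued: usize,
    max_active: usize,
    peak_kv_pressure: f64,
    queued_pressure_ticks: usize,
    active_capacity_ticks: usize,
}

/// Folds a replay into a summary. Assumes `kv_blocks_total > 0`.
fn summarize(ticks: &[Tick], kv_blocks_total: u32, active_slots: usize) -> PressureSummary {
    let mut s = PressureSummary {
        max_queued: 0,
        max_active: 0,
        peak_kv_pressure: 0.0,
        queued_pressure_ticks: 0,
        active_capacity_ticks: 0,
    };
    for t in ticks {
        s.max_queued = s.max_queued.max(t.queued);
        s.max_active = s.max_active.max(t.active);
        // KV pressure as the fraction of KV blocks in use on this tick.
        let pressure = f64::from(t.kv_blocks_used) / f64::from(kv_blocks_total);
        s.peak_kv_pressure = s.peak_kv_pressure.max(pressure);
        if t.queued > 0 {
            s.queued_pressure_ticks += 1;
        }
        if t.active >= active_slots {
            s.active_capacity_ticks += 1;
        }
    }
    s
}
```

Under these assumptions, each pressure ratio would be the matching tick count divided by the total number of replay ticks.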

Validation

  • cargo fmt --all --check
  • cargo test --all-targets
  • parsed all JSON fixtures/artifacts with PowerShell ConvertFrom-Json
  • git diff --check

WaffleBits force-pushed the codex/runtime-telemetry-gates-20260624 branch from de9425a to cf99d78 on June 24, 2026 21:35
WaffleBits marked this pull request as ready for review on June 26, 2026 13:06
WaffleBits merged commit 36d6001 into main on Jun 26, 2026
1 check passed
WaffleBits deleted the codex/runtime-telemetry-gates-20260624 branch on June 26, 2026 13:06
