docs(benchmarks): reproducible gateway, PII, and evidence benchmarks (#119) by sergeyenin · Pull Request #164 · dativo-io/talon

sergeyenin · 2026-06-03T10:21:26Z

Summary

Closes the last P1 in credibility epic #108 — reproducible benchmarks for the README "< 15 ms excluding upstream" claim.

BenchmarkGatewayPipelineOverhead: full ServeHTTP path with OPA policy, PII scan, local mock upstream, response scan, and signed evidence write (rate limits raised for bench stability).
make benchmarks / scripts/run-benchmarks.sh: runs the gateway benchmark plus the existing BenchmarkPIIScan and BenchmarkEvidenceStore, then prints a markdown table with Go version, OS, CPU, commit, and the raw go test lines.
docs/reference/benchmarks.md: methodology, scope, exclusions (WAN RTT; retry/fallback until #138 "Provider fallback chains (error-driven, sovereignty-respecting)" and #139 "Retries with backoff (recorded as evidence fact)" land), and an interpretation guide.
Links from README, LIMITATIONS, docs index, and the request-lifecycle throughput section.

Local sample (Apple M1 Max, make benchmarks): gateway ~5.5 ms/req, PII ~0.08 ms/scan, evidence ~1360 writes/s — under the 15 ms budget with mock upstream.
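
For readers comparing their own run against the budget, here is a minimal sketch of turning one raw go test benchmark line into a markdown row with a budget check. It is not the logic in scripts/run-benchmarks.sh; parseNsPerOp and budgetMs are hypothetical names, and the sample line is constructed from the ~5.5 ms figure above (its iteration count is made up).

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// budgetMs is the README pipeline budget, excluding upstream latency.
const budgetMs = 15.0

// parseNsPerOp returns the benchmark name and the ns/op value from one
// raw `go test -bench` output line. Hypothetical helper for illustration.
func parseNsPerOp(line string) (string, float64, error) {
	fields := strings.Fields(line)
	for i := 1; i < len(fields); i++ {
		if fields[i] == "ns/op" {
			v, err := strconv.ParseFloat(fields[i-1], 64)
			return fields[0], v, err
		}
	}
	return "", 0, fmt.Errorf("no ns/op value in %q", line)
}

func main() {
	// Illustrative line only: 5500000 ns/op matches the ~5.5 ms/req sample.
	line := "BenchmarkGatewayPipelineOverhead-10   210   5500000 ns/op"

	name, ns, err := parseNsPerOp(line)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	ms := ns / 1e6 // nanoseconds to milliseconds
	fmt.Printf("| %s | %.2f ms/op | under %.0f ms budget: %t |\n",
		name, ms, budgetMs, ms < budgetMs)
}
```

The same conversion applies to throughput figures: ~1360 writes/s corresponds to roughly 0.74 ms per evidence write.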

Test plan

make benchmarks succeeds
scripts/check-claim-discipline.sh passes
Gateway bench no longer hits 429 rate limit

Closes #119

Note

Low Risk
Documentation and benchmark harness only; no production gateway behavior changes beyond a new test with elevated rate limits for stability.

Overview
Adds a reproducible proof-bar for the README “under 15 ms excluding upstream” pipeline claim: operators run make benchmarks (or scripts/run-benchmarks.sh) to get a markdown table of gateway overhead, PII scan latency, and evidence write throughput on their machine.

New gateway benchmark BenchmarkGatewayPipelineOverhead exercises a full non-streaming ServeHTTP path against a local httptest upstream (OPA, PII, response scan, signed evidence), with rate limits raised so the bench does not 429. The runner aggregates that benchmark with existing BenchmarkPIIScan and BenchmarkEvidenceStore, records Go/OS/CPU/commit, and dumps raw go test lines.

docs/reference/benchmarks.md documents methodology, what is in/out of scope (no WAN RTT, no retry/fallback until Epic #113), and how to interpret results. LIMITATIONS and doc indexes now point at reproducible benchmarks instead of a vague “forthcoming” note; README and the request-lifecycle doc now link to make benchmarks alongside the optional docker/hey load harness.

Reviewed by Cursor Bugbot for commit fa7270f.

…rks (#119) Add BenchmarkGatewayPipelineOverhead (ServeHTTP with local mock upstream), scripts/run-benchmarks.sh, and make benchmarks to emit a markdown table with hardware metadata. Document methodology in docs/reference/benchmarks.md and link from README, LIMITATIONS, and the request-lifecycle doc. Closes #119

cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Bugbot Autofix is ON. A cloud agent has been kicked off to fix the reported issue.

cursor · 2026-06-03T10:23:45Z

+		if w.Code != http.StatusOK {
+			b.Fatalf("status %d: %s", w.Code, w.Body.String())
+		}
+	}


Gateway bench cost query drift

Medium Severity

BenchmarkGatewayPipelineOverhead times repeated ServeHTTP calls against one SQLite evidence store that grows every iteration. Each request runs callerCostTotals, which scans accumulating rows via CostByAgent, so measured ns/op rises during the run and overstates steady per-request overhead versus a fixed-size store.
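
One possible way to bound the drift, sketched under assumptions: swap in a fresh store every fixed number of iterations with the timer stopped, so the rows scanned per operation never exceed that number. fakeStore and benchBounded are hypothetical stand-ins; the real evidence store is SQLite-backed, and this is not necessarily how the autofix addresses the finding.

```go
package main

import (
	"fmt"
	"testing"
)

// fakeStore is a hypothetical stand-in for the evidence store: every
// write appends a row, and the cost query scans all accumulated rows.
type fakeStore struct{ rows []float64 }

func (s *fakeStore) write(cost float64) { s.rows = append(s.rows, cost) }

func (s *fakeStore) costTotal() float64 {
	var sum float64
	for _, c := range s.rows {
		sum += c
	}
	return sum
}

// benchBounded replaces the store every resetEvery iterations while the
// timer is stopped, so each timed query scans fewer than resetEvery rows.
func benchBounded(b *testing.B, resetEvery int) {
	s := &fakeStore{}
	for i := 0; i < b.N; i++ {
		if i > 0 && i%resetEvery == 0 {
			b.StopTimer()
			s = &fakeStore{} // reset cost is excluded from ns/op
			b.StartTimer()
		}
		_ = s.costTotal() // per-request cost lookup
		s.write(0.01)     // per-request evidence write
	}
}

func main() {
	res := testing.Benchmark(func(b *testing.B) { benchBounded(b, 1000) })
	fmt.Println(res)
}
```

An alternative with the same goal is to pre-seed the store to a fixed, documented size before b.ResetTimer and report that size next to the result, which measures steady-state cost at a known row count instead of at a small one.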

cursor Bot reviewed Jun 3, 2026

sergeyenin merged commit 93c9d53 into main Jun 3, 2026
9 checks passed

sergeyenin deleted the docs/benchmarks-119 branch June 3, 2026 11:12
