docs(conformance): publish reproducible evidence + policy conformance count (#120) #163
Merged
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Bugbot Autofix prepared a fix for the issue found in the latest run.
- ✅ Fixed: CI pipe masks conformance failure
  - Added `set -o pipefail` before the pipeline so the step exits with `make conformance`'s exit code instead of `tee`'s.
Preview (0550a77729)
diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml
--- a/.github/workflows/ci.yml
+++ b/.github/workflows/ci.yml
@@ -63,6 +63,11 @@
exit 1
fi
+ - name: Conformance count (evidence + policy paths)
+ run: |
+ set -o pipefail
+ make conformance | tee -a "$GITHUB_STEP_SUMMARY"
+
- name: OPA/Rego policy tests
run: |
if ! command -v opa &> /dev/null; then
diff --git a/Makefile b/Makefile
--- a/Makefile
+++ b/Makefile
@@ -16,8 +16,13 @@
GO_ENV := env -u CC CC=/usr/bin/clang CGO_ENABLED=1
endif
-.PHONY: help build install test test-integration test-e2e test-smoke test-all test-ssot-gate lint fmt clean vet mod-tidy check docker-build demo-gateway demo-full demo-clean verify-flow0 nosec-count
+.PHONY: help build install test test-integration test-e2e test-smoke test-all test-ssot-gate conformance lint fmt clean vet mod-tidy check docker-build demo-gateway demo-full demo-clean verify-flow0 nosec-count
+# Conformance suite: the evidence + policy paths whose passing test/subtest
+# count is published as Talon's honest conformance number. See
+# docs/reference/conformance.md.
+CONFORMANCE_PKGS := ./internal/policy/... ./internal/evidence/...
+
help: ## Show this help
@grep -E '^[a-zA-Z_-]+:.*?## .*$$' $(MAKEFILE_LIST) | sort | awk 'BEGIN {FS = ":.*?## "}; {printf "\033[36m%-20s\033[0m %s\n", $$1, $$2}'
@@ -52,6 +57,12 @@
test-ssot-gate: ## Run consolidated SSOT parity/resilience gate tests
@$(GO_ENV) go test -count=1 ./internal/server -run SSOTGate
+conformance: ## Run the evidence + policy conformance suite and print the passing count
+ @out=$$($(GO_ENV) go test -count=1 -run . -v $(CONFORMANCE_PKGS) 2>&1); rc=$$?; \
+ count=$$(printf '%s\n' "$$out" | grep -c -- '--- PASS:'); \
+ if [ $$rc -ne 0 ]; then printf '%s\n' "$$out" | tail -20; echo "conformance: FAILED ($$count passing before failure)"; exit 1; fi; \
+ echo "Conformance: $$count passing tests across evidence + policy paths ($(CONFORMANCE_PKGS))"
+
lint: ## Run linter
@golangci-lint run ./...
diff --git a/README.md b/README.md
--- a/README.md
+++ b/README.md
@@ -138,6 +138,7 @@
- HMAC-SHA256 signed evidence record per request; verify with `talon audit verify`.
- Export to CSV, JSON, or signed JSON/NDJSON for auditors and offline verification.
- Supporting controls mapped to GDPR Article 30, NIS2, DORA, and EU AI Act traceability.
+- Conformance: **317 passing tests** across the evidence + policy paths — reproduce with `make conformance`. See [Conformance suite & count](docs/reference/conformance.md).
See [Evidence store](docs/explanation/evidence-store.md).
@@ -297,6 +298,7 @@
- [Policy cookbook](docs/guides/policy-cookbook.md)
- [Provider registry](docs/reference/provider-registry.md)
- [Evidence store](docs/explanation/evidence-store.md)
+- [Conformance suite & count](docs/reference/conformance.md)
- [Gateway dashboard](docs/reference/gateway-dashboard.md)
- [OpenClaw integration](docs/guides/openclaw-integration.md)
- [Slack bot integration](docs/guides/slack-bot-integration.md)
diff --git a/docs/README.md b/docs/README.md
--- a/docs/README.md
+++ b/docs/README.md
@@ -68,6 +68,7 @@
|-----|-------------|
| [Configuration and environment](reference/configuration.md) | Environment variables, crypto keys, and config reference. |
| [Evidence integrity specification](reference/evidence-integrity-spec.md) | Normative signed-record spec: fields, canonical serialization, HMAC-SHA256 signing, and the independent verification procedure. |
+| [Conformance suite & count](reference/conformance.md) | What counts as a conformance test for the evidence + policy paths, and how to reproduce the published count with `make conformance`. |
| [Authentication and key scopes](reference/authentication-and-key-scopes.md) | Which keys authenticate which endpoint families (gateway vs control plane vs dashboard). |
| [Gateway dashboard](reference/gateway-dashboard.md) | Dashboard endpoints, metrics API schema, snapshot fields, and authentication. |
| [Operational control plane](reference/operational-control-plane.md) | Run management (list/kill/pause/resume), tenant lockdown, runtime overrides, tool approval gates. |
@@ -95,6 +96,7 @@
| [Why not just a PII proxy?](explanation/why-not-a-pii-proxy.md) | Control-plane vs scrubber differentiation with proof commands. |
| [Evidence store](explanation/evidence-store.md) | HMAC integrity model and verification flow. |
| [Evidence integrity specification](reference/evidence-integrity-spec.md) | Byte-exact spec so a third party can independently verify a record. |
+| [Conformance suite & count](reference/conformance.md) | Reproducible passing-test count for the evidence + policy paths (`make conformance`). |
| [Evidence integrity 5-minute proof](tutorials/evidence-integrity-demo.md) | Fast proof moment for auditors/operators, including offline signed-export verification. |
| [Security policy](../SECURITY.md) | Vulnerability reporting process and security scope. |
| [Docker Compose demo](../examples/docker-compose/README.md) | Fastest no-key proof loop. |
diff --git a/docs/reference/conformance.md b/docs/reference/conformance.md
new file mode 100644
--- /dev/null
+++ b/docs/reference/conformance.md
@@ -1,0 +1,83 @@
+# Conformance Suite & Published Count
+
+**Status:** stable · **Scope:** the evidence and policy execution paths.
+
+Talon publishes a single, reproducible number: the count of passing tests across
+the two paths that carry its core guarantees — the **evidence** path (how records
+are built, signed, exported, and verified) and the **policy** path (how requests
+are classified, routed, and allowed or denied).
+
+The number is meant to be *checkable*, not impressive. Anyone can reproduce it
+from a clean checkout, and CI prints it on every run. The authoritative value is
+whatever `make conformance` reports for the commit you are looking at.
+
+## Reproduce it
+
+```bash
+make conformance
+```
+
+Example output:
+
+```
+Conformance: 317 passing tests across evidence + policy paths (./internal/policy/... ./internal/evidence/...)
+```
+
+The target runs `go test -count=1 -run . -v` over `./internal/policy/...` and
+`./internal/evidence/...`, then counts the `--- PASS:` lines emitted by the Go test
+runner. That count includes both top-level test functions and table-driven
+subtests, so each named case is counted once. `-count=1` disables the test cache,
+so the number is computed fresh every time. If any test fails, the target exits
+non-zero and prints the failure tail instead of a count.
+
+## What is in scope
+
+The count aggregates the test files in the two packages below. The list is
+descriptive — the suite is simply "every test in these two packages", so new tests
+raise the number automatically without touching this document.
+
+**Policy path — `internal/policy`**
+
+| File | Covers |
+|------|--------|
+| `engine_test.go` | Policy engine evaluate/decision logic |
+| `gateway_engine_test.go` | Gateway-mode policy evaluation |
+| `golden_test.go` | Golden policy decisions against `testdata/` fixtures |
+| `loader_test.go` | `.talon.yaml` policy loading and validation |
+| `routing_policy_test.go` | Tier-based model routing decisions |
+| `classifier_convert_test.go` | Classifier → policy-input conversion |
+| `proxy_test.go` | Proxy-mode policy enforcement |
+| `openclaw_gaps_test.go` | Regression cases for known governance gaps |
+| `metrics_test.go` | Policy decision metrics |
+
+**Evidence path — `internal/evidence`**
+
+| File | Covers |
+|------|--------|
+| `store_test.go` | Evidence record build, persist, query, tenant scoping |
+| `signed_export_test.go` | Signed JSON/NDJSON export and offline verification |
+| `integrity_spec_test.go` | Round-trip of the [evidence integrity spec](evidence-integrity-spec.md) |
+| `schema_compat_test.go` | Backward compatibility of the record schema |
+| `export_test.go` | CSV/JSON export shape |
+| `metrics_test.go` | Evidence write metrics |
+
+### Adjacent suites (counted separately)
+
+The embedded OPA/Rego policies have their own test suite that runs under the `opa`
+toolchain rather than `go test`, so it is **not** included in the Go conformance
+count. Run it with `make opa-test`. Integration and end-to-end tiers
+(`make test-integration`, `make test-e2e`) exercise the same paths through the
+running binary and are likewise tracked separately.
+
+## What the number means — and what it does not
+
+- It **does** mean: the evidence and policy code paths have this many passing,
+ deterministic checks that anyone can re-run, and a regression that breaks one of
+ them fails CI.
+- It **does not** mean: that the suite is exhaustive, that it covers every
+ configuration, or that a high count by itself demonstrates a compliance outcome.
+ Talon produces supporting controls and evidence; coverage and limitations are
+ documented in [`LIMITATIONS.md`](../../LIMITATIONS.md).
+
+The count is a floor that grows as tests are added; it is not a marketing target.
+Treat the live output of `make conformance` as the source of truth.
Reviewed by Cursor Bugbot for commit dd9b197.
- name: Conformance count (evidence + policy paths)
  run: |
    make conformance | tee -a "$GITHUB_STEP_SUMMARY"
CI pipe masks conformance failure
High Severity
The conformance step pipes `make conformance` into `tee`, but the workflow does not enable `pipefail`, so the step's exit status is `tee`'s (usually zero) even when `make conformance` exits non-zero after test failures.
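A minimal sketch of the masking behaviour described in this comment, assuming a bash shell. `failing_step` is a hypothetical stand-in for `make conformance` exiting non-zero after a test failure; it is not part of the PR.

```bash
#!/usr/bin/env bash
# Hypothetical stand-in for `make conformance` failing after a test failure.
failing_step() { echo "conformance: FAILED"; return 1; }

# Without pipefail, the pipeline's status is that of the last command (tee).
set +o pipefail
failing_step | tee /dev/null
status_without=$?

# With pipefail, the pipeline reports the failing command's status instead.
set -o pipefail
status_with=0
failing_step | tee /dev/null || status_with=$?
set +o pipefail

echo "without pipefail: $status_without"   # prints 0: the failure is masked
echo "with pipefail: $status_with"         # prints 1: the failure propagates
```

As far as the GitHub Actions documentation describes it, a `run:` step with no explicit `shell:` is invoked on Linux runners as `bash -e {0}`, without `pipefail`, which would explain why the option has to be set by hand in this step.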
… count (#120) Add a `make conformance` target that runs the evidence + policy test paths and prints the passing test/subtest count (currently 317), failing if any test fails. Surface the number in CI (printed to the step summary) and in the README, and document what counts as a conformance test, how to reproduce it from a clean checkout, and what the number does and does not mean in docs/reference/conformance.md. Closes #120
0550a77 to f87cc6a
Summary
Talon's honest answer to "how many conformance tests do you have?" is a single, reproducible number for the two paths that carry its core guarantees: the evidence path (build/sign/export/verify) and the policy path (classify/route/allow-deny).
- `make conformance` runs `go test -count=1 -run . -v` over `./internal/policy/...` and `./internal/evidence/...`, counts passing tests/subtests, and prints `Conformance: N passing ...`. Currently 317. Fails (non-zero) if any test fails.
- `docs/reference/conformance.md` defines what counts as a conformance test, the suite composition, the counting method, how to reproduce from a clean checkout, and what the number does/does not mean (no compliance-outcome overclaim).

The published number is a floor that grows automatically as tests are added. The suite is simply "every test in those two packages", so no list needs hand-maintaining. The authoritative value is whatever `make conformance` prints for a given commit.

Adjacent suites (OPA/Rego via `make opa-test`, integration, e2e) are intentionally tracked separately and called out in the doc.

Test plan
- `make conformance` → `Conformance: 317 passing tests across evidence + policy paths`
- `scripts/check-claim-discipline.sh` passes (doc avoids banned outcome phrasing)
- … (`evidence-integrity-spec.md`, `../../LIMITATIONS.md`)

Closes #120
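The two outcomes of the target (print a count on success, exit non-zero on failure) can be sketched in plain bash. This is an illustration modelled on the Makefile recipe in the diff, not the recipe itself: `conformance_report` and the sample outputs are hypothetical, and the captured `go test` output and exit code are passed in as arguments rather than produced by a real test run.

```bash
#!/usr/bin/env bash
# conformance_report OUTPUT RC
# Hypothetical helper mirroring the recipe's two outcomes.
conformance_report() {
  local out=$1 rc=$2 count
  # grep -c prints 0 and exits 1 when nothing matches; `|| true` keeps going.
  count=$(printf '%s\n' "$out" | grep -c -- '--- PASS:' || true)
  if [ "$rc" -ne 0 ]; then
    printf '%s\n' "$out" | tail -20
    echo "conformance: FAILED ($count passing before failure)"
    return 1
  fi
  echo "Conformance: $count passing tests"
}

# Invented sample outputs; the test names are not taken from the repository.
ok_out='--- PASS: TestStore (0.01s)
--- PASS: TestExport (0.00s)'
bad_out='--- PASS: TestStore (0.01s)
--- FAIL: TestExport (0.00s)'

ok_line=$(conformance_report "$ok_out" 0)
bad_rc=0
bad_text=$(conformance_report "$bad_out" 1) || bad_rc=$?

echo "$ok_line"
echo "$bad_text"
echo "exit status on failure: $bad_rc"
```

Note that the failure branch still reports how many tests passed before the failure, so a non-zero exit is the signal to rely on rather than the presence or absence of a number.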
Note
Low Risk
Documentation and CI/Makefile tooling only; no runtime, auth, or production code paths change.
Overview
Adds a reproducible conformance count for Talon's core evidence and policy Go test packages, so the published number is whatever `make conformance` prints for a given commit, not a hand-maintained figure.

`make conformance` runs `go test -count=1 -run . -v` over `./internal/policy/...` and `./internal/evidence/...`, counts `--- PASS:` lines (including table subtests), prints a one-line total (currently 317 in docs), and fails CI if any test fails. CI adds a step that runs this target and appends output to the GitHub job step summary.

`docs/reference/conformance.md` defines scope, counting method, in-scope test files, adjacent suites (`make opa-test`, integration/e2e) excluded from the count, and explicit limits on what the number does not claim. README and docs index link the doc and surface the count under Evidence & compliance.
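To make the counting method concrete, here is a small sketch that applies the same `grep -c -- '--- PASS:'` filter to a hand-written sample of `go test -v` output. The test names are invented for illustration and do not come from the repository.

```bash
#!/usr/bin/env bash
# Hypothetical `go test -v` output: one table-driven test with two subtests,
# plus one failing test. Subtest result lines are indented by the Go runner.
sample_output='=== RUN   TestEvaluate
=== RUN   TestEvaluate/allow
=== RUN   TestEvaluate/deny
--- PASS: TestEvaluate (0.00s)
    --- PASS: TestEvaluate/allow (0.00s)
    --- PASS: TestEvaluate/deny (0.00s)
=== RUN   TestLoader
--- FAIL: TestLoader (0.00s)
FAIL'

# `--` stops option parsing so the leading dashes are read as the pattern.
count=$(printf '%s\n' "$sample_output" | grep -c -- '--- PASS:')
echo "passing lines: $count"   # prints 3: the parent test and its two subtests
```

One consequence of this method is that a table-driven test contributes its parent line as well as one line per named case, so the count is a tally of passing result lines rather than of distinct test functions.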