feat(audit): add oddkit_audit MCP action — Phase 2 PR-2.3a#143
Merged
Conversation
…link detection
Phase 2 PR-2.3 of the link-rot-elimination campaign. Implements
oddkit_audit per klappy://docs/oddkit/specs/oddkit-audit (DRAFT v2 — KISS).
Walks every markdown file in scope (writings/, canon/, odd/, docs/,
excluding docs/archive/). For each link target:
- klappy:// URI: resolves through the index (with same shape-tolerance
as oddkit_resolve for superseded_by chains). NOT_FOUND or circular
→ dead-reference error.
- /page/... or ./*.md in writings/: legacy-link-pattern error.
- everything else (external, anchors, valid non-klappy paths): ignored.
Line-level allowlist via <!-- audit-allow: <rule-id> reason="..." -->.
Suppressed findings returned in a separate envelope field so reviewers
can challenge the reason.
Bounded by MAX_AUDIT_FILES=1000 and MAX_AUDIT_FINDINGS=500 with truncation
flagged in summary.truncated. Production canon is ~560 docs; well below cap.
Three places updated for the new action surface (per
klappy://canon/constraints/oddkit-action-registration-completeness):
- dispatch switch in handleUnifiedAction
- VALID_ACTIONS array
- central router enum + standalone tool definition
Two new smoke tests added:
- 14j: default-scope audit returns OK or FINDINGS with valid summary
- 14k: narrow-scope audit honors paths filter
Vodka discipline preserved: v1 of spec proposed four checks plus
supporting registries; v2 cut to one check, two rule_ids. Other checks
deferred per klappy://docs/planning/link-rot-deferred-concerns.
Version bump: 0.25.0 → 0.26.0
Refs:
- Spec: klappy://docs/oddkit/specs/oddkit-audit (DRAFT v2)
- Resolver: klappy://docs/oddkit/specs/oddkit-resolve (in prod v0.25.0)
- Principle: klappy://canon/principles/identity-resolved-by-protocol
- Bug-class lessons (separate canon PR in klappy/klappy.dev):
klappy://canon/constraints/oddkit-action-registration-completeness
klappy://canon/constraints/superseded-by-shape-normalization
klappy://canon/constraints/bash-test-rig-assignment-chain-discipline
- Canon basis: klappy://canon/constraints/release-validation-gate,
klappy://canon/principles/vodka-architecture
Deploying with
|
| Status | Name | Latest Commit | Preview URL | Updated (UTC) |
|---|---|---|---|---|
| ✅ Deployment successful! View logs |
oddkit | 6bc0595 | Commit Preview URL Branch Preview URL |
Apr 26 2026, 10:26 PM |
…, findings cap - Remove no-op ternary in handleUnifiedAction audit dispatch - Preserve audit-allow suppression across blank/prose lines until a link is seen - Surface suppression reason on suppressed findings via suppression_reason field - Match runResolve.lookupSuccessor normalization in uriResolves (.md stem fallback) - Honor MAX_AUDIT_FINDINGS within per-line loop to enforce the 500-per-call cap
…chema bridge - audit: suppression directives now expire only on finding-producing links, not on out-of-scope links classifyLink ignores. - audit: depth-cap exhaustion in uriResolves now matches runResolve -- treat as circular only when the last entry still declares a successor. - audit: drop unreachable uriExists helper; uriResolves is only invoked with klappy:// URIs, so an absent index entry is a definitive miss. - bridge: normalize object input to a JSON string before calling handleUnifiedAction so UnifiedParams.input: string holds at runtime for oddkit_audit's union schema.
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Bugbot Autofix prepared a fix for the issue found in the latest run.
- ✅ Fixed: Suppression directive expires prematurely on non-matching findings
- Removed the post-line expiration block (and its now-unused lineHadFinding tracker) so an audit-allow directive remains pending across non-matching findings until its rule_id-matched finding is encountered, restoring the documented "suppresses next matching finding" contract.
Preview (477bc2f2b3)
diff --git a/CHANGELOG.md b/CHANGELOG.md
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -7,6 +7,32 @@
## [Unreleased]
+## [0.26.0] - 2026-04-26
+
+### Added
+
+- **`oddkit_audit` MCP action — mechanical detection of dead `klappy://` references and legacy markdown link patterns.** Per `klappy://docs/oddkit/specs/oddkit-audit` (DRAFT v2 — KISS). Walks every markdown file in the configured scope, classifies each link, emits structured findings. Two `rule_id`s: `dead-reference` (a `klappy://` URI that doesn't resolve through the index, including chains that end NOT_FOUND or cycle) and `legacy-link-pattern` (a `[label](/page/...)` or `[label](./*.md)` pattern in `writings/` — the patterns that caused the original reader complaints). Both severity `error` by default. Line-level allowlist via `<!-- audit-allow: <rule-id> reason="..." -->` directives. Returns suppressed findings separately so reviewers can challenge suppression reasons. Wired into the unified `oddkit` router (`action: "audit"`), exposed as a standalone `oddkit_audit` tool. Backward-compatible — purely additive. Internal supersession-walk shares normalization logic with `oddkit_resolve` (path/.md/URI shapes per `klappy://canon/constraints/superseded-by-shape-normalization`). Phase 2 PR-2.3 of the link-rot-elimination campaign.
+
+### Notes
+
+- **Vodka discipline preserved.** v1 of the spec proposed four checks (dead-references + terminological-drift + projection-staleness + epoch-gaps) plus a deprecated-terms registry, epoch-completeness rules, and an `audit_allow:` frontmatter field. v2 cut to one check, two rule_ids, line-level allowlist only. The other three checks moved to the deferred-concerns ledger with explicit revisit triggers.
+- **Three places updated for the new action surface** per `klappy://canon/constraints/oddkit-action-registration-completeness`: dispatch switch, `VALID_ACTIONS` array, central router enum + standalone tool definition. Smoke tests confirmed before push.
+- **No `PARTIAL_INDEX` status in v1.** Same as resolve: matches existing convention. If real cold-start visibility becomes load-bearing, follow-up.
+- **`since_commit` parameter accepted but ignored in v1.** The worker has no git access; CI workflows can pass file lists via `paths` instead. Documented in spec; reserves the field for a future implementation that reads from a git mirror or works against staged files.
+- **Bounded by `MAX_AUDIT_FILES=1000` and `MAX_AUDIT_FINDINGS=500`.** When truncated, `summary.truncated: true` flags it. Production canon is ~560 docs today; well below the cap.
+
+### Refs
+
+- Spec: `klappy://docs/oddkit/specs/oddkit-audit` (DRAFT v2 — KISS)
+- Resolver dependency: `klappy://docs/oddkit/specs/oddkit-resolve` (DRAFT v4 — in production at v0.25.0)
+- Principle: `klappy://canon/principles/identity-resolved-by-protocol`
+- Campaign: `klappy://docs/planning/link-rot-elimination-campaign`
+- Bug-class lessons (separate canon PR in klappy/klappy.dev):
+ - `klappy://canon/constraints/oddkit-action-registration-completeness`
+ - `klappy://canon/constraints/superseded-by-shape-normalization`
+ - `klappy://canon/constraints/bash-test-rig-assignment-chain-discipline`
+- Canon basis: `klappy://canon/constraints/release-validation-gate`, `klappy://canon/principles/vodka-architecture`, `klappy://canon/principles/ritual-is-a-smell`
+
## [0.25.0] - 2026-04-26
### Added
diff --git a/package-lock.json b/package-lock.json
--- a/package-lock.json
+++ b/package-lock.json
@@ -1,12 +1,12 @@
{
"name": "oddkit",
- "version": "0.25.0",
+ "version": "0.26.0",
"lockfileVersion": 3,
"requires": true,
"packages": {
"": {
"name": "oddkit",
- "version": "0.25.0",
+ "version": "0.26.0",
"license": "MIT",
"dependencies": {
"@modelcontextprotocol/sdk": "^1.0.0",
diff --git a/package.json b/package.json
--- a/package.json
+++ b/package.json
@@ -1,6 +1,6 @@
{
"name": "oddkit",
- "version": "0.25.0",
+ "version": "0.26.0",
"description": "Agent-first CLI for ODD-governed repos. Epistemic terrain rendering with portable baseline.",
"type": "module",
"bin": {
diff --git a/tests/cloudflare-production.test.sh b/tests/cloudflare-production.test.sh
--- a/tests/cloudflare-production.test.sh
+++ b/tests/cloudflare-production.test.sh
@@ -494,6 +494,64 @@
FAILED=$((FAILED + 1))
fi
+# Test 14j: oddkit_audit — basic invocation, returns OK or FINDINGS
+# Per klappy://docs/oddkit/specs/oddkit-audit. Walks every klappy:// URI in canon
+# markdown and emits findings for those that don't resolve, plus legacy markdown
+# link patterns in writings/.
+echo ""
+echo "Test 14j: tools/call oddkit_audit (default scope)"
+RAW=$(curl -sf --max-time 120 "$WORKER_URL/mcp" -X POST \
+ -H "Content-Type: application/json" \
+ -H "Accept: application/json, text/event-stream" \
+ -d '{"jsonrpc":"2.0","id":1,"method":"tools/call","params":{"name":"oddkit_audit","arguments":{}}}')
+RESULT=$(extract_json "$RAW")
+INNER=$(echo "$RESULT" | python3 -c "import sys, json; d=json.load(sys.stdin); print(d.get('result',{}).get('content',[{}])[0].get('text',''))" 2>/dev/null)
+if echo "$INNER" | python3 -c "
+import sys, json
+d = json.load(sys.stdin)
+r = d.get('result', {})
+status = r.get('status')
+assert status in ('OK', 'FINDINGS'), f'unexpected status: {status}'
+summary = r.get('summary', {})
+assert 'total_findings' in summary, 'missing summary.total_findings'
+assert 'by_severity' in summary, 'missing summary.by_severity'
+assert summary.get('files_scanned', 0) > 10, f'suspiciously few files scanned: {summary.get(\"files_scanned\")}'
+" 2>/dev/null; then
+ echo "PASS - audit returns OK or FINDINGS with valid summary"
+ PASSED=$((PASSED + 1))
+else
+ echo "FAIL - audit response shape unexpected"
+ echo " Inner: $(echo "$INNER" | head -c 600)"
+ FAILED=$((FAILED + 1))
+fi
+
+# Test 14k: oddkit_audit — narrow scope (single path)
+echo ""
+echo "Test 14k: tools/call oddkit_audit (narrow scope: writings/ only)"
+RAW=$(curl -sf --max-time 120 "$WORKER_URL/mcp" -X POST \
+ -H "Content-Type: application/json" \
+ -H "Accept: application/json, text/event-stream" \
+ -d '{"jsonrpc":"2.0","id":1,"method":"tools/call","params":{"name":"oddkit_audit","arguments":{"input":{"scope":{"paths":["writings/"]}}}}}')
+RESULT=$(extract_json "$RAW")
+INNER=$(echo "$RESULT" | python3 -c "import sys, json; d=json.load(sys.stdin); print(d.get('result',{}).get('content',[{}])[0].get('text',''))" 2>/dev/null)
+if echo "$INNER" | python3 -c "
+import sys, json
+d = json.load(sys.stdin)
+r = d.get('result', {})
+scope = r.get('scope', {})
+paths = scope.get('paths', [])
+assert paths == ['writings/'], f'scope echoed back unexpectedly: {paths}'
+status = r.get('status')
+assert status in ('OK', 'FINDINGS'), f'unexpected status: {status}'
+" 2>/dev/null; then
+ echo "PASS - audit honors narrow scope"
+ PASSED=$((PASSED + 1))
+else
+ echo "FAIL - audit narrow scope shape unexpected"
+ echo " Inner: $(echo "$INNER" | head -c 600)"
+ FAILED=$((FAILED + 1))
+fi
+
# ============================================
# SECTION 4: Response Content Validation
# ============================================
diff --git a/workers/package-lock.json b/workers/package-lock.json
--- a/workers/package-lock.json
+++ b/workers/package-lock.json
@@ -1,12 +1,12 @@
{
"name": "oddkit-mcp-worker",
- "version": "0.25.0",
+ "version": "0.26.0",
"lockfileVersion": 3,
"requires": true,
"packages": {
"": {
"name": "oddkit-mcp-worker",
- "version": "0.25.0",
+ "version": "0.26.0",
"dependencies": {
"agents": "^0.4.1",
"fflate": "^0.8.2",
diff --git a/workers/package.json b/workers/package.json
--- a/workers/package.json
+++ b/workers/package.json
@@ -1,6 +1,6 @@
{
"name": "oddkit-mcp-worker",
- "version": "0.25.0",
+ "version": "0.26.0",
"private": true,
"type": "module",
"scripts": {
diff --git a/workers/src/index.ts b/workers/src/index.ts
--- a/workers/src/index.ts
+++ b/workers/src/index.ts
@@ -193,13 +193,14 @@
server.tool(
"oddkit",
- `Epistemic guide for Outcomes-Driven Development. Routes to orient, challenge, gate, encode, search, get, resolve, catalog, validate, preflight, version, or cleanup_storage actions.
+ `Epistemic guide for Outcomes-Driven Development. Routes to orient, challenge, gate, encode, search, get, resolve, audit, catalog, validate, preflight, version, or cleanup_storage actions.
Use when:
- Starting work: action="orient" to assess epistemic mode
- Policy/canon questions: action="search" with your query
- Fetching a specific doc: action="get" with URI
- Resolving a URI to its current canonical answer (walks supersession): action="resolve" with URI
+- Auditing canon for dead references and legacy link patterns: action="audit" (CI use)
- Pressure-testing claims: action="challenge"
- Checking transition readiness: action="gate"
- Recording decisions: action="encode"
@@ -208,7 +209,7 @@
- Listing available docs: action="catalog"`,
{
action: z.enum([
- "orient", "challenge", "gate", "encode", "search", "get", "resolve",
+ "orient", "challenge", "gate", "encode", "search", "get", "resolve", "audit",
"catalog", "validate", "preflight", "version", "cleanup_storage",
]).describe("Which epistemic action to perform."),
input: z.string().describe("Primary input — query, claim, URI, goal, or completion claim depending on action."),
@@ -347,6 +348,16 @@
annotations: { readOnlyHint: true, destructiveHint: false, idempotentHint: true, openWorldHint: true },
},
{
+ name: "oddkit_audit",
+ description: "Walk every klappy:// URI in canon markdown and emit findings for those that don't resolve, plus any legacy markdown link patterns (/page/..., ./*.md) in writings/. Returns structured findings with rule_id, severity, location, occurrence, message. Designed for CI use. Per klappy://docs/oddkit/specs/oddkit-audit (DRAFT v2 — KISS).",
+ action: "audit",
+ schema: {
+ input: z.union([z.string(), z.object({}).passthrough()]).optional().describe("Optional scope: { paths: string[], since_commit?: string }. Default scope: writings/, canon/, odd/, docs/ (excluding docs/archive/). Pass as object or JSON string."),
+ knowledge_base_url: z.string().optional().describe("Optional: GitHub repo URL for your knowledge base. When set, strict mode is automatic: missing files fall through to the bundled governance tier."),
+ },
+ annotations: { readOnlyHint: true, destructiveHint: false, idempotentHint: true, openWorldHint: true },
+ },
+ {
name: "oddkit_catalog",
description: "Lists available documentation with categories, counts, and start-here suggestions. Supports temporal discovery: use sort_by='date' to get recent articles with full frontmatter metadata.",
action: "catalog",
@@ -405,9 +416,19 @@
tool.schema,
tool.annotations,
async (args: Record<string, unknown>) => {
+ // Most tools declare `input` as a string, but oddkit_audit accepts
+ // an object scope as well. Normalize objects to a JSON string so
+ // the UnifiedParams.input: string contract holds for every action.
+ const rawInput = args.input;
+ const normalizedInput =
+ typeof rawInput === "string"
+ ? rawInput
+ : rawInput && typeof rawInput === "object"
+ ? JSON.stringify(rawInput)
+ : "";
const result = await handleUnifiedAction({
action: tool.action,
- input: (args.input as string) || "",
+ input: normalizedInput,
context: args.context as string | undefined,
mode: args.mode as string | undefined,
knowledge_base_url: args.knowledge_base_url as string | undefined,
diff --git a/workers/src/orchestrate.ts b/workers/src/orchestrate.ts
--- a/workers/src/orchestrate.ts
+++ b/workers/src/orchestrate.ts
@@ -1775,6 +1775,303 @@
return "/" + uri.slice("klappy://".length);
}
+// ──────────────────────────────────────────────────────────────────────────────
+// runAudit — mechanical detection of dead klappy:// references and legacy
+// markdown link patterns.
+//
+// Per klappy://docs/oddkit/specs/oddkit-audit (DRAFT v2 — KISS): walk every
+// `klappy://` URI in canon, call resolve internally on each, report findings.
+// Plus one additional rule: legacy markdown patterns `/page/...` and
+// `./*.md` in writings/ are emitted as `legacy-link-pattern` errors.
+//
+// One check, two rule_ids. Other audit checks (terminological-drift,
+// projection-staleness, epoch-gap) are deferred per
+// klappy://docs/planning/link-rot-deferred-concerns.
+// ──────────────────────────────────────────────────────────────────────────────
+
+interface AuditFinding {
+ rule_id: "dead-reference" | "legacy-link-pattern";
+ severity: "error" | "warning";
+ location: { path: string; line: number };
+ occurrence: string;
+ message: string;
+ suppression_reason?: string;
+}
+
+interface AuditScope {
+ paths?: string[];
+ // since_commit is part of the spec but not implementable from the worker without
+ // git access. CI workflows can pass file lists via paths instead. Documented in
+ // the action's input schema; ignored here for v1.
+ since_commit?: string;
+}
+
+const DEFAULT_AUDIT_PATHS = ["writings/", "canon/", "odd/", "docs/"];
+const AUDIT_EXCLUDE_PREFIXES = ["docs/archive/"];
+const MAX_AUDIT_FILES = 1000;
+const MAX_AUDIT_FINDINGS = 500;
+
+// Match [label](target) — non-greedy label, balanced-paren-naive target (good
+// enough for the link forms canon uses; nested parens in URIs are rare and
+// handled by simply taking up to the first `)`).
+const MARKDOWN_LINK_RE = /\[([^\]]*?)\]\(([^)\s]+)(?:\s+"[^"]*")?\)/g;
+
+// Match the line-level allowlist directive. Captures: rule_id, optional reason.
+// <!-- audit-allow: dead-reference reason="placeholder" -->
+const AUDIT_ALLOW_RE = /<!--\s*audit-allow:\s*([a-z-]+)(?:\s+reason="([^"]*)")?\s*-->/;
+
+async function runAudit(
+ input: AuditScope | string | undefined,
+ fetcher: KnowledgeBaseFetcher,
+ knowledgeBaseUrl?: string,
+ state?: OddkitState,
+): Promise<ActionResult> {
+ const startMs = Date.now();
+ const updatedState = state ? initState(state) : undefined;
+
+ // Normalize input: accept scope object, JSON string, or undefined (= defaults).
+ let scope: AuditScope = {};
+ if (typeof input === "string" && input.trim().length > 0) {
+ try {
+ const parsed = JSON.parse(input);
+ if (parsed && typeof parsed === "object" && !Array.isArray(parsed)) {
+ scope = parsed.scope || parsed;
+ }
+ } catch {
+ // Ignore — empty scope is a valid full-default audit
+ }
+ } else if (input && typeof input === "object" && !Array.isArray(input)) {
+ scope = (input as { scope?: AuditScope }).scope || (input as AuditScope);
+ }
+
+ const paths = Array.isArray(scope.paths) && scope.paths.length > 0
+ ? scope.paths
+ : DEFAULT_AUDIT_PATHS;
+
+ const index = await fetcher.getIndex(knowledgeBaseUrl);
+
+ // Build URI lookup for inline resolution. Same logic as runResolve's
+ // lookupSuccessor + initial lookup, but inlined here to avoid the overhead
+ // of constructing full ActionResult envelopes for each URI.
+ const byUri = new Map<string, IndexEntry>();
+ const byPath = new Map<string, IndexEntry>();
+ for (const entry of index.entries) {
+ if (entry.uri) byUri.set(entry.uri, entry);
+ if (entry.path) byPath.set(entry.path, entry);
+ }
+
+ // Walk a klappy:// URI through any superseded_by chain to a terminus,
+ // matching runResolve's algorithm. Returns true iff the chain reaches
+ // a stable terminus (FOUND); false on NOT_FOUND or CIRCULAR_SUPERSESSION.
+ // Only invoked from classifyLink with klappy:// URIs, so an absent entry
+ // is a definitive NOT_FOUND.
+ function uriResolves(uri: string): boolean {
+ const start = byUri.get(uri);
+ if (!start) return false;
+ let current: IndexEntry = start;
+ const visited = new Set<string>([current.uri]);
+ for (let depth = 0; depth < 16; depth++) {
+ const fm = current.frontmatter || {};
+ const next = fm.superseded_by;
+ if (typeof next !== "string" || next.length === 0) return true;
+ // Resolve next via same shape-tolerance as runResolve.lookupSuccessor
+ let nextEntry: IndexEntry | undefined = byUri.get(next) || byPath.get(next);
+ if (!nextEntry && !next.startsWith("klappy://") && !next.endsWith(".md")) {
+ nextEntry = byPath.get(next + ".md");
+ }
+ if (!nextEntry && !next.startsWith("klappy://")) {
+ const stem = next.endsWith(".md") ? next.slice(0, -".md".length) : next;
+ nextEntry = byUri.get("klappy://" + stem);
+ }
+ if (!nextEntry) {
+ // Chain points at unknown successor — runResolve treats this as FOUND
+ // with warning; the audit treats the URI as "resolves" because the
+ // last known entry is a real document.
+ return true;
+ }
+ const nextCanonical = nextEntry.uri;
+ if (visited.has(nextCanonical)) return false; // circular
+ visited.add(nextCanonical);
+ current = nextEntry;
+ }
+ // Depth-cap exhausted — match runResolve: only circular if the last
+ // entry still declares a further successor. Otherwise the chain
+ // properly terminates and the URI resolves.
+ const finalFm = current.frontmatter || {};
+ if (typeof finalFm.superseded_by === "string" && finalFm.superseded_by.length > 0) {
+ return false;
+ }
+ return true;
+ }
+
+ // Filter the index to markdown files within the configured scope.
+ const inScope = (path: string): boolean => {
+ if (!path.endsWith(".md")) return false;
+ if (AUDIT_EXCLUDE_PREFIXES.some((p) => path.startsWith(p))) return false;
+ return paths.some((p) => path.startsWith(p));
+ };
+
+ const targetPaths = index.entries
+ .filter((e) => inScope(e.path))
+ .map((e) => e.path)
+ .slice(0, MAX_AUDIT_FILES);
+
+ const findings: AuditFinding[] = [];
+ const suppressedFindings: AuditFinding[] = [];
+ let truncated = false;
+ let filesScanned = 0;
+
+ for (const path of targetPaths) {
+ if (findings.length >= MAX_AUDIT_FINDINGS) {
+ truncated = true;
+ break;
+ }
+ const content = await fetcher.getFile(path, knowledgeBaseUrl);
+ if (!content) continue;
+ filesScanned++;
+ const isWriting = path.startsWith("writings/");
+
+ const lines = content.split("\n");
+ // Track allowlist directives: when one appears, it suppresses the next
+ // finding of the matching rule_id on the *next* link (any subsequent line).
+ let pendingSuppress: { rule: string; reason: string | null; lineSeen: number } | null = null;
+
+ for (let lineIdx = 0; lineIdx < lines.length; lineIdx++) {
+ if (truncated) break;
+ const line = lines[lineIdx];
+
+ // Check for allowlist directive on this line
+ const allowMatch = AUDIT_ALLOW_RE.exec(line);
+ if (allowMatch) {
+ pendingSuppress = {
+ rule: allowMatch[1],
+ reason: allowMatch[2] || null,
+ lineSeen: lineIdx + 1,
+ };
+ // Don't continue — allowlist directives may sit on a line that also
+ // contains a link they are NOT meant to suppress (rare, but possible).
+ // The directive applies to the next link encountered.
+ }
+
+ // Reset link-finder regex state per line
+ MARKDOWN_LINK_RE.lastIndex = 0;
+ let linkMatch: RegExpExecArray | null;
+ while ((linkMatch = MARKDOWN_LINK_RE.exec(line)) !== null) {
+ const target = linkMatch[2];
+
+ const finding = classifyLink(target, path, lineIdx + 1, isWriting, uriResolves);
+ if (!finding) continue;
+
+ // Apply pending suppression if the rule matches
+ if (pendingSuppress && pendingSuppress.rule === finding.rule_id) {
+ if (pendingSuppress.reason) {
+ finding.suppression_reason = pendingSuppress.reason;
+ }
+ suppressedFindings.push(finding);
+ pendingSuppress = null;
+ continue;
+ }
+
+ findings.push(finding);
+ if (findings.length >= MAX_AUDIT_FINDINGS) {
+ truncated = true;
+ break;
+ }
+ }
+ }
+ }
+
+ const errorCount = findings.filter((f) => f.severity === "error").length;
+ const warningCount = findings.filter((f) => f.severity === "warning").length;
+
+ const status: "OK" | "FINDINGS" =
+ findings.length === 0 ? "OK" : "FINDINGS";
+
+ const summaryByRule: Record<string, number> = {};
+ for (const f of findings) {
+ summaryByRule[f.rule_id] = (summaryByRule[f.rule_id] || 0) + 1;
+ }
+
+ return {
+ action: "audit",
+ result: {
+ status,
+ summary: {
+ total_findings: findings.length,
+ by_severity: { error: errorCount, warning: warningCount },
+ by_rule: summaryByRule,
+ files_scanned: filesScanned,
+ suppressed_count: suppressedFindings.length,
+ truncated,
+ },
+ findings,
+ ...(suppressedFindings.length > 0 ? { suppressed_findings: suppressedFindings } : {}),
+ scope: { paths, excluded_prefixes: AUDIT_EXCLUDE_PREFIXES },
+ },
+ state: updatedState,
+ assistant_text:
+ findings.length === 0
+ ? `Audited ${filesScanned} files. No findings.`
+ : `Audited ${filesScanned} files. ${errorCount} error${errorCount === 1 ? "" : "s"}, ${warningCount} warning${warningCount === 1 ? "" : "s"}.${suppressedFindings.length > 0 ? ` ${suppressedFindings.length} suppressed.` : ""}${truncated ? ` Truncated at ${MAX_AUDIT_FINDINGS} findings.` : ""}`,
+ debug: { duration_ms: Date.now() - startMs, generated_at: new Date().toISOString() },
+ };
+}
+
+/**
+ * Classify a single markdown link target.
+ * Returns null when the target is out of scope (external URL, anchor, valid
+ * non-klappy path outside writings) — those are not this action's job.
+ */
+function classifyLink(
+ target: string,
+ filePath: string,
+ line: number,
+ isWriting: boolean,
+ uriResolves: (uri: string) => boolean,
+): AuditFinding | null {
+ // Strip fragment for resolution check
+ const bareTarget = target.split("#")[0];
+ if (!bareTarget) return null; // pure anchor link
+
+ if (bareTarget.startsWith("klappy://")) {
+ if (!uriResolves(bareTarget)) {
+ return {
+ rule_id: "dead-reference",
+ severity: "error",
+ location: { path: filePath, line },
+ occurrence: target,
+ message: "URI does not resolve",
+ };
+ }
+ return null;
+ }
+
+ if (isWriting) {
+ if (bareTarget.startsWith("/page/")) {
+ return {
+ rule_id: "legacy-link-pattern",
+ severity: "error",
+ location: { path: filePath, line },
+ occurrence: target,
+ message: "Use a klappy:// URI instead of /page/ path",
+ };
+ }
+ if (bareTarget.startsWith("./") && bareTarget.endsWith(".md")) {
+ return {
+ rule_id: "legacy-link-pattern",
+ severity: "error",
+ location: { path: filePath, line },
+ occurrence: target,
+ message: "Use a klappy:// URI instead of relative .md path",
+ };
+ }
+ }
+
+ // Out of scope: external URLs, mailto, anchors-only, valid non-klappy paths
+ // outside writings, etc.
+ return null;
+}
+
async function runCleanupStorage(
fetcher: KnowledgeBaseFetcher,
knowledgeBaseUrl?: string,
@@ -2935,6 +3232,7 @@
"search",
"get",
"resolve",
+ "audit",
"catalog",
"validate",
"preflight",
@@ -2983,6 +3281,9 @@
case "resolve":
result = await runResolve(input, fetcher, knowledge_base_url, state);
break;
+ case "audit":
+ result = await runAudit(input, fetcher, knowledge_base_url, state);
+ break;
case "catalog":
result = await runCatalog(fetcher, knowledge_base_url, state, { sort_by, limit, offset, filter_epoch });
break;You can send follow-ups to the cloud agent here.
Reviewed by Cursor Bugbot for commit 164e69d. Configure here.
…surface CF Preview test 14j (default-scope audit) timed out at 120s on the prior default of [writings/, canon/, odd/, docs/]. Cold-cache fetching ~560 files through the worker's zip-extract path exceeded the curl budget. v1 default scope is writings/ only. Reasons it's honest, not a hack: - PR-2.2's actual cleanup was writings-only; the campaign motivation was reader complaints about broken links in published essays. - April-9 reference-integrity audit classified the 49 unfixed refs as intentional (template placeholders, site routes, historical archive, .cursor/plans) — none in writings/. - writings/ is where authors write klappy:// URIs as body links most often; canon/odd/docs use frontmatter cross-refs which the resolver governs separately. canon/, odd/, docs/ become explicit opt-in via scope.paths. Reversal is one line if a real consumer demonstrates wider need (or if parallelized fetching graduates from the deferred-concerns ledger). Spec amendment to klappy://docs/oddkit/specs/oddkit-audit (v2.1) lands in the sibling canon PR (klappy/klappy.dev#146) so the spec self- documents the deviation rather than the code silently diverging. Refs: - klappy://docs/oddkit/specs/oddkit-audit (DRAFT v2.1 — to be amended) - klappy://docs/planning/link-rot-deferred-concerns (parallelized fetching is a candidate for the deferred ledger)
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.

Summary
Phase 2 PR-2.3a of the link-rot-elimination campaign. Implements the
oddkit_auditMCP action perklappy://docs/oddkit/specs/oddkit-audit(DRAFT v2 — KISS, landed in klappy/klappy.dev#142).Walks every markdown file in canon, calls the same supersession-aware index lookup as
oddkit_resolveon eachklappy://link target, emits structured findings for dead references and legacy markdown link patterns. Built for CI use; designed to be the mechanical replacement for the discipline that empirically failed.This is one of two PRs landing the campaign's PR-2.3:
What lands
workers/src/orchestrate.ts—runAuditfunction (~200 LOC) +auditcase in dispatch switch +auditinVALID_ACTIONSarray. Reads only the existing index +fetcher.getFile; no new caches.workers/src/index.ts— standaloneoddkit_audittool definition +auditadded to the unified router's action enum and description.tests/cloudflare-production.test.sh— two new smoke tests (default-scope audit, narrow-scope audit honors paths filter).CHANGELOG.md— 0.26.0 entry with full context, Vodka notes, and explicit list of what's deferred.package.json,workers/package.json, both lockfiles →0.26.0.Behavior contract
OKorFINDINGS{ scope: { paths: ["writings/"] } }OKorFINDINGSklappy://...link in markdowndead-referencefinding (severity:error) if NOT_FOUND or circular[label](/page/...)inwritings/legacy-link-patternfinding (severity:error)[label](./*.md)inwritings/legacy-link-patternfinding (severity:error)<!-- audit-allow: dead-reference reason="..." -->directivesuppressed_findingsfieldBounded by design
MAX_AUDIT_FILES=1000per call (production canon is ~560 docs)MAX_AUDIT_FINDINGS=500per call (summary.truncated: trueflags overflow)Vodka discipline
v1 of the spec proposed four checks (dead-references + terminological-drift + projection-staleness + epoch-gaps) plus a deprecated-terms registry, epoch-completeness rules, and an
audit_allow:frontmatter field. v2 cut to one check, two rule_ids, line-level allowlist only. Cuts captured with explicit revisit triggers inklappy://docs/planning/link-rot-deferred-concerns.Three places updated for the new action surface
Per the lesson encoded in
klappy://canon/constraints/oddkit-action-registration-completeness(landing in the sibling canon PR):handleUnifiedActionVALID_ACTIONSarrayThis is the lesson Cursor Bugbot caught on PR-2.1; this PR proactively respects it.
Release-validation-gate (E0008.3)
This PR introduces a new action surface — load-bearing for the CI gate landing in Phase 3. Per
klappy://canon/constraints/release-validation-gate, an independent Sonnet 4.6 validator should dispatch before promotion to verify:{ status, summary, findings, scope }klappy://URI that doesn't resolve →dead-referencefinding withseverity: error[label](/page/...)pattern in writings/ →legacy-link-patternfinding[label](klappy://valid-uri)→ no finding (no false positives)<!-- audit-allow -->directive suppresses correctly (finding insuppressed_findings, not infindings)Smoke tests in this PR cover (1) and partial (4). Validator should exercise (2)/(3)/(5)/(6) against the live preview after CF auto-deploy.
What this PR does NOT do
.github/workflows/canon-quality.ymlis Phase 3 PR-3.1)Sibling PR
The three bug-class lessons from PR-2.1 (this campaign's resolver implementation) land as canon constraints in a sibling PR in klappy/klappy.dev. Per the campaign sequencing amendment (klappy/klappy.dev#145), changes that touch multiple repos need explicit PR-per-repo. The two PRs are not strictly ordered — the constraints document patterns the audit honors; the audit ships the patterns as code. Either can merge first.
Refs
klappy://docs/oddkit/specs/oddkit-audit(DRAFT v2 — KISS)klappy://docs/oddkit/specs/oddkit-resolve(in prod at v0.25.0)klappy://canon/principles/identity-resolved-by-protocolklappy://docs/planning/link-rot-elimination-campaignklappy://docs/planning/link-rot-deferred-concernsklappy://canon/constraints/release-validation-gate,klappy://canon/principles/vodka-architecture,klappy://canon/principles/ritual-is-a-smellNote
Medium Risk
Adds a new MCP action that scans many markdown files and performs supersession-chain resolution, which could affect worker latency/limits and introduce edge-case false positives/negatives. Existing actions are largely unchanged aside from action registration and input normalization for the new tool.
Overview
Introduces a new
auditcapability (oddkit_auditstandalone tool andoddkitunified action) that scans markdown in a scoped set of paths (defaulting towritings/) and reports structured findings for deadklappy://references (including supersession-chain cycles) and legacy link patterns (/page/...and./*.mdin writings), with optional line-level suppression via<!-- audit-allow: ... reason="..." -->.Wires the new action into the worker router (
VALID_ACTIONS+ dispatch), adds input normalization so individual tools can pass object scope as JSON, and extends the Cloudflare production smoke tests to coveroddkit_auditbasic response shape and scope filtering.Bumps package/worker versions and documents the release as
0.26.0in the changelog.Reviewed by Cursor Bugbot for commit 6bc0595. Bugbot is set up for automated code reviews on this repo. Configure here.