docs(security): document deterministic tool-scanner detect engine (Spec 076 T022) #780

Merged: Dumbris merged 2 commits into main from docs/spec076-tool-scanner-mcp3683 on Jun 28, 2026

Conversation

Dumbris (Member) commented Jun 28, 2026

Summary

Completes T022 from specs/076-deterministic-tool-scanner/tasks.md, which was not done in the implementation PRs (#769#777). Documentation only — no code changes.

Adds a dedicated docs/features/tool-scanner.md describing the offline detect engine (internal/security/detect/) that powers the built-in tpa-descriptions scanner:

  • The six checks: unicode.hidden, shadowing.cross_server, payload.decoded (hard tier); directive.imperative, capability.mismatch, secret.embedded (soft tier), each with what it catches and its FP controls.
  • The two-tier model: hard signals auto-quarantine; soft-only severity is the distinct soft-check count (1→low / 2→medium / 3+→high); independent signals add to confidence and risk score rather than dedup-collapsing.
  • The eval gate: scan-eval --gate --min-recall 0.90 --max-fp 0.05, exit code 6 on breach, the hard-negative-only FP gate, and the forward-compatible category gating; plus its blocking CI step in .github/workflows/eval.yml (security-d2).
  • The offline / no-egress guarantee: no network/filesystem/Docker/exec (import-guard enforced), deterministic output, recover()-isolated checks.
  • Normalization rules: raw text for hidden-Unicode + secrets, normalized text for phrase checks.
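
As a rough illustration of the two-tier rule in the list above, the following Go sketch maps a set of findings to a quarantine flag and a soft severity. The types `Tier`, `Finding`, and `Verdict`, the function `Aggregate`, and the "none" severity for zero soft findings are all assumptions made for this example; the actual aggregation lives in aggregate.go and may be shaped differently. Confidence and risk-score accumulation are not modeled here.

```go
package main

import "fmt"

// Tier, Finding and Verdict are hypothetical types for this sketch; the real
// definitions in internal/security/detect may look different.
type Tier int

const (
	Hard Tier = iota // e.g. unicode.hidden, shadowing.cross_server, payload.decoded
	Soft             // e.g. directive.imperative, capability.mismatch, secret.embedded
)

type Finding struct {
	CheckID string
	Tier    Tier
}

type Verdict struct {
	Quarantine   bool   // true as soon as any hard-tier check fires
	SoftSeverity string // "none" (assumed), "low", "medium" or "high"
}

// Aggregate applies the two-tier rule: hard findings set Quarantine, and the
// soft severity depends on how many distinct soft checks fired.
func Aggregate(findings []Finding) Verdict {
	var v Verdict
	distinctSoft := make(map[string]struct{})
	for _, f := range findings {
		if f.Tier == Hard {
			v.Quarantine = true
			continue
		}
		distinctSoft[f.CheckID] = struct{}{}
	}
	switch n := len(distinctSoft); {
	case n >= 3:
		v.SoftSeverity = "high"
	case n == 2:
		v.SoftSeverity = "medium"
	case n == 1:
		v.SoftSeverity = "low"
	default:
		v.SoftSeverity = "none"
	}
	return v
}

func main() {
	v := Aggregate([]Finding{
		{CheckID: "directive.imperative", Tier: Soft},
		{CheckID: "directive.imperative", Tier: Soft}, // same check twice counts once
		{CheckID: "secret.embedded", Tier: Soft},
	})
	fmt.Printf("quarantine=%v soft=%s\n", v.Quarantine, v.SoftSeverity)
}
```

Counting distinct check IDs rather than raw findings means a single noisy soft check cannot raise the severity on its own.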

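Also:

  • Expands the tpa-descriptions row in security-scanner-plugins.md to point at the new page, and links the page from Related reading.
  • Registers the page in the docs sidebar.
  • Checks off T013-T019 and T022 in the Spec 076 tasks checklist.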

Provenance

Every claim is sourced from the code on main: check IDs/tiers from internal/security/detect/checks/*.go, aggregation from aggregate.go, the gate from cmd/scan-eval/gate.go, CI wiring from .github/workflows/eval.yml, and the offline contract from internal/security/detect/doc.go.
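
To make the gate thresholds concrete, here is a minimal Go sketch of a recall / false-positive gate that exits with code 6 on breach. It is not the code in cmd/scan-eval/gate.go: the `Counts` struct, the helper names, the strict comparisons, and the treatment of an empty denominator are assumptions for illustration. Only the thresholds (0.90 and 0.05), the exit code, and the hard-negative-only FP denominator come from the description above.

```go
package main

import (
	"fmt"
	"os"
)

// Counts is a hypothetical summary of an evaluation run.
type Counts struct {
	TruePositives  int // malicious samples that were flagged
	FalseNegatives int // malicious samples that were missed
	FalsePositives int // hard-negative samples that were flagged
	HardNegatives  int // total hard-negative samples (the only FP denominator)
}

// exitGateBreach mirrors the documented exit code for a gate breach.
const exitGateBreach = 6

// ratio returns num/den, or 0 when den is 0 (an assumption of this sketch).
func ratio(num, den int) float64 {
	if den == 0 {
		return 0
	}
	return float64(num) / float64(den)
}

// Recall is the fraction of malicious samples that were flagged.
func Recall(c Counts) float64 {
	return ratio(c.TruePositives, c.TruePositives+c.FalseNegatives)
}

// FPRate is the fraction of hard negatives that were flagged.
func FPRate(c Counts) float64 {
	return ratio(c.FalsePositives, c.HardNegatives)
}

// GateBreached reports whether recall is below the minimum or the
// false-positive rate is above the maximum.
func GateBreached(c Counts, minRecall, maxFP float64) bool {
	return Recall(c) < minRecall || FPRate(c) > maxFP
}

func main() {
	c := Counts{TruePositives: 95, FalseNegatives: 5, FalsePositives: 2, HardNegatives: 100}
	fmt.Printf("recall=%.2f fp=%.2f\n", Recall(c), FPRate(c))
	if GateBreached(c, 0.90, 0.05) {
		os.Exit(exitGateBreach)
	}
}
```

A distinct non-zero exit code lets a CI step tell a gate breach apart from an ordinary tool failure.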

Testing

Docs-only change (exempt from TDD per CLAUDE.md). Pre-commit hooks (trailing whitespace, EOF, merge-conflict, gofmt) passed. No Go code touched.

Related: Spec 076 — MCP-3683

… (Spec 076 T022)

Adds docs/features/tool-scanner.md covering the offline detect engine behind
the built-in tpa-descriptions scanner:

- the six checks (unicode.hidden / shadowing.cross_server / payload.decoded —
  hard tier; directive.imperative / capability.mismatch / secret.embedded —
  soft tier)
- the two-tier model (hard auto-quarantines; soft severity = distinct soft-check
  count 1->low/2->medium/3+->high; consensus adds to confidence/risk score)
- the eval gate (scan-eval --gate --min-recall 0.90 --max-fp 0.05, exit 6 on
  breach) and its blocking CI wiring in .github/workflows/eval.yml
- the offline / no-egress guarantee (no I/O, deterministic, recover-isolated)
- normalization rules (raw-text hidden-Unicode + secrets, normalized phrases)
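
The split between raw and normalized text in the last item can be sketched as follows. This is an assumption-laden illustration using only the Go standard library: `HasHiddenUnicode` and `Normalize` are hypothetical names, treating the Unicode `Cf` (format) category as "hidden" is a simplification, and the engine's actual normalization rules may differ.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// HasHiddenUnicode inspects the RAW text. It has to run before normalization,
// because normalization removes the very characters it looks for.
func HasHiddenUnicode(raw string) bool {
	for _, r := range raw {
		if unicode.Is(unicode.Cf, r) { // format characters such as U+200B
			return true
		}
	}
	return false
}

// Normalize prepares text for phrase checks: it drops format characters,
// lowercases, and collapses runs of whitespace into single spaces.
func Normalize(raw string) string {
	stripped := strings.Map(func(r rune) rune {
		if unicode.Is(unicode.Cf, r) {
			return -1 // a negative value removes the rune
		}
		return r
	}, raw)
	return strings.Join(strings.Fields(strings.ToLower(stripped)), " ")
}

func main() {
	raw := "Ig\u200bnore   PREVIOUS\ninstructions"
	norm := Normalize(raw)
	fmt.Println("hidden in raw:", HasHiddenUnicode(raw))
	fmt.Println("hidden in normalized:", HasHiddenUnicode(norm))
	fmt.Println("phrase match:", strings.Contains(norm, "ignore previous instructions"))
}
```

Running the hidden-character check on normalized text would miss the zero-width space, while running the phrase check on raw text would miss the split word, which is why the two kinds of check read different views of the same description.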

Also expands the tpa-descriptions row in security-scanner-plugins.md to point
at the new page, links it from Related reading, registers it in the docs
sidebar, and checks off T013-T019 + T022 in the Spec 076 tasks checklist.

Docs-only change (exempt from TDD per CLAUDE.md). No code touched.

Related: Spec 076 (specs/076-deterministic-tool-scanner)

cloudflare-workers-and-pages Bot commented Jun 28, 2026

Deploying mcpproxy-docs with Cloudflare Pages

Latest commit: a59b4f1
Status: ✅  Deploy successful!
Preview URL: https://dfd6f137.mcpproxy-docs.pages.dev
Branch Preview URL: https://docs-spec076-tool-scanner-mc.mcpproxy-docs.pages.dev

@codecov-commenter

Codecov Report

✅ All modified and coverable lines are covered by tests.

github-actions Bot commented Jun 28, 2026

📦 Build Artifacts

Workflow Run: View Run
Branch: docs/spec076-tool-scanner-mcp3683

Available Artifacts

  • archive-darwin-amd64 (28 MB)
  • archive-darwin-arm64 (25 MB)
  • archive-linux-amd64 (16 MB)
  • archive-linux-arm64 (14 MB)
  • archive-windows-amd64 (28 MB)
  • archive-windows-arm64 (25 MB)
  • frontend-dist-pr (0 MB)
  • installer-dmg-darwin-amd64 (21 MB)
  • installer-dmg-darwin-arm64 (19 MB)

How to Download

Option 1: GitHub Web UI (easiest)

  1. Go to the workflow run page linked above
  2. Scroll to the bottom "Artifacts" section
  3. Click on the artifact you want to download

Option 2: GitHub CLI

gh run download 28316259234 --repo smart-mcp-proxy/mcpproxy-go

Note: Artifacts expire in 14 days.

CodexReviewer review of #780: the docs overstated that tpa-descriptions is
purely the new two-tier detect engine. The live scanner
(internal/security/scanner/inprocess.go) still appends the legacy TPA keyword
rules (tpa_hidden_instructions / prompt_injection_in_description /
data_exfiltration_in_description) after the detect-engine findings, and those
are ThreatLevelDangerous — they block security approve and drive the summary
to dangerous (confirmed by e2e_tpa_smoke_test.go).
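
The coexistence described in this commit message can be pictured with a small Go sketch: legacy findings are appended after the detect-engine findings, and a single dangerous finding drives the overall summary. Only the name `ThreatLevelDangerous` and the rule IDs come from the text above; the other threat levels, the `Finding` struct, the function names, and the mapping of a soft detect-engine finding to a warning level are hypothetical and do not reflect the real inprocess.go.

```go
package main

import "fmt"

// ThreatLevel ordering is an assumption; only ThreatLevelDangerous is named
// in the commit message.
type ThreatLevel int

const (
	ThreatLevelInfo      ThreatLevel = iota // hypothetical
	ThreatLevelWarning                      // hypothetical
	ThreatLevelDangerous                    // named in the commit message
)

// Finding is a hypothetical, simplified finding record.
type Finding struct {
	RuleID string
	Level  ThreatLevel
}

// Combine appends the legacy findings after the detect-engine findings,
// preserving the order within each group.
func Combine(detect, legacy []Finding) []Finding {
	out := make([]Finding, 0, len(detect)+len(legacy))
	out = append(out, detect...)
	return append(out, legacy...)
}

// Summary returns the worst level among all findings.
func Summary(findings []Finding) ThreatLevel {
	worst := ThreatLevelInfo
	for _, f := range findings {
		if f.Level > worst {
			worst = f.Level
		}
	}
	return worst
}

// BlocksApprove reports whether the combined result would block approval.
func BlocksApprove(findings []Finding) bool {
	return Summary(findings) == ThreatLevelDangerous
}

func main() {
	detect := []Finding{{RuleID: "directive.imperative", Level: ThreatLevelWarning}}
	legacy := []Finding{{RuleID: "tpa_hidden_instructions", Level: ThreatLevelDangerous}}
	all := Combine(detect, legacy)
	fmt.Println("findings:", len(all), "blocks approve:", BlocksApprove(all))
}
```

In this picture a soft-only detect-engine result does not block on its own, yet the combined result still can, which is the distinction the review asked the docs to state.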

Documents the current coexistence accurately:
- tool-scanner.md: scope note on the two-tier table + a new "Coexistence with
  the legacy TPA rules" subsection + a plug-in-section pointer; the
  "soft never auto-quarantines" rule is the detect-engine's, not the legacy
  rules'.
- security-scanner-plugins.md: tpa-descriptions row notes the still-active
  dangerous legacy rules.

Folding the legacy rules into the detect engine remains a separate
implementation change (out of scope for this docs PR).

Related: Spec 076 (specs/076-deterministic-tool-scanner)

Co-Authored-By: Paperclip <noreply@paperclip.ing>

mcpproxy-gatekeeper Bot left a comment

Gatekeeper approval — Codex review verdict: ACCEPT.

This approval is posted automatically by the MCPProxy Gatekeeper App on behalf of the Codex reviewer (verdict of record lives in the Paperclip review thread). Author≠approver satisfied; QA + CI gates enforced separately.

Auto-approved per Model B (MCP-1249).

Dumbris merged commit fe0304b into main on Jun 28, 2026
37 checks passed