Local-hardening: encrypt vault, remove /audit reverse-lookup, loopback bind, outbound allowlist #7
Open
Arkessiah wants to merge 1 commit into
…ound allowlist

Make DontFeedTheAI a purely local tool. Real data never leaves the machine and is never exposed over HTTP; only masked surrogates cross the boundary.

- Encrypt the surrogate vault at rest (Fernet / AES-128 + HMAC), key derived from VAULT_KEY via PBKDF2. A keyed HMAC blind index deduplicates values without storing cleartext. verify.db is encrypted the same way. Fail-closed: missing key → proxy refuses to start; wrong key → canary abort. (src/crypto.py)
- Remove the /audit dashboard and the transform_log table behind it; the surrogate→original reverse-lookup is no longer served over HTTP.
- Bind to loopback by default (HOST=127.0.0.1); compose publishes ports on 127.0.0.1 only.
- Restrict outbound traffic to the configured LLM upstreams via an allowlist; every other destination is blocked. (src/netguard.py)
- Harden /health so it never exposes vault contents or counts.
- Update docs, .env.example and tests; add encryption tests.

Tests: 149 passed, 55 skipped (Ollama integration).
Summary
This PR hardens DontFeedTheAI for local, single-operator use: it removes the
HTTP surface that exposes the reverse-lookup table, encrypts the vault at rest,
binds to loopback by default, and restricts outbound traffic to the configured
LLM upstreams. Only masked surrogates ever cross the network boundary; the real
data never leaves the machine and is never readable over HTTP.
Two of these changes implement items already on your own roadmap (see
`docs/threat-model.md`: "Write-only vault — the reverse lookup is never exposed over HTTP", and the README note that making
`/audit` write-only is planned).
Motivation
- `/audit` exposes the full `surrogate → original` table over HTTP, and the middleware allows `/audit` even when `PROXY_SECRET` is set. Anyone who can reach the port can reverse the entire anonymization for the engagement. This is the single highest-impact risk and was already acknowledged as a roadmap item.
- … hostnames). A stolen `data/` directory means full exposure across every engagement, a poor fit for the NDAs most pentest work operates under.
- … `0.0.0.0`, exposing the proxy (and the audit table) to the whole LAN.
- … from turning the data-holding proxy into an exfiltration channel.
What changed
Encrypted vault at rest (`src/crypto.py`, new)
- … from a `VAULT_KEY` passphrase via PBKDF2 (200k iterations). The passphrase is never written to disk; only a non-secret salt and an encrypted canary live in the DB.
- `verify.db` (background verifier) encrypts its `original` column the same way.
- Fail-closed: missing `VAULT_KEY` → the proxy refuses to start; a wrong passphrase aborts on the canary check instead of silently diverging.
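As a rough illustration of the key handling described above (not the actual contents of `src/crypto.py`), the sketch below derives a Fernet-compatible key and a separate blind-index key from a passphrase using only the standard library. The function names, the 64-byte split into two keys, and the use of SHA-256 are assumptions made for the example; only PBKDF2, the 200k iteration count, and the keyed-HMAC blind index come from this PR.

```python
import base64
import hashlib
import hmac

PBKDF2_ITERATIONS = 200_000  # iteration count stated in this PR


def derive_keys(passphrase: str, salt: bytes) -> tuple[bytes, bytes]:
    """Derive (fernet_key, index_key) from a passphrase and a non-secret salt.

    Assumption: one PBKDF2 output is split into two independent 32-byte keys.
    """
    raw = hashlib.pbkdf2_hmac(
        "sha256", passphrase.encode("utf-8"), salt, PBKDF2_ITERATIONS, dklen=64
    )
    # Fernet expects 32 random bytes, URL-safe base64 encoded.
    fernet_key = base64.urlsafe_b64encode(raw[:32])
    index_key = raw[32:]
    return fernet_key, index_key


def blind_index(index_key: bytes, value: str) -> str:
    """Keyed HMAC of a value: equal inputs map to equal digests,
    so duplicates can be detected without storing the cleartext."""
    return hmac.new(index_key, value.encode("utf-8"), hashlib.sha256).hexdigest()
```

The `fernet_key` returned here is in the format that `cryptography.fernet.Fernet(key)` accepts. A canary check would then amount to decrypting a known stored token with that key at startup and aborting if decryption fails.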
Remove the HTTP reverse-lookup
- Removed the `/audit` dashboard and the `transform_log` table that backed it. The `surrogate → original` mapping is no longer served over HTTP at all.
- `/health` no longer returns vault contents or counts.
Local-only networking
- Binds to loopback by default (`HOST=127.0.0.1`); compose files publish ports on `127.0.0.1` only.
- Outbound allowlist (`src/netguard.py`): the proxy only connects to the configured Anthropic/OpenAI/OpenRouter upstreams and local Ollama; any other destination is blocked. Extra hosts can be added via `UPSTREAM_ALLOWLIST`.
Breaking changes: happy to gate these
I kept this branch fail-closed because it targets a strict local setup, but I
understand the VPS/tunnel flow is your recommended mode. I'm glad to rework any
of the breaking parts behind opt-in flags so existing users aren't affected:
- … `VAULT_ENCRYPTION=true`) with a one-time migration for existing plaintext vaults.
- … `HOST`, but I can keep `0.0.0.0` as the default and just document the loopback recommendation.
- `/audit` removal could instead become write-only (show entity types and counts, never the original values), matching the roadmap wording more directly.
I'm also happy to split this into smaller, focused PRs (e.g. audit-hardening
first, then encryption) if that's easier to review.
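To make the opt-in idea concrete, here is a minimal sketch of how the gating could look if it were done through environment variables. `VAULT_ENCRYPTION`, `VAULT_KEY` and `HOST` are the names used in this PR; the `HardeningSettings` class, the `load_settings` helper and the set of accepted truthy strings are hypothetical and only illustrate the shape of the change.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Assumption: which strings count as "enabled" is not specified in the PR.
TRUTHY = {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class HardeningSettings:
    vault_encryption: bool
    host: str


def load_settings(env: Optional[Mapping[str, str]] = None) -> HardeningSettings:
    """Read the opt-in flags; existing deployments keep their current behaviour."""
    env = os.environ if env is None else env
    vault_encryption = env.get("VAULT_ENCRYPTION", "false").strip().lower() in TRUTHY
    # Stay fail-closed once encryption is requested: no key, no start.
    if vault_encryption and not env.get("VAULT_KEY"):
        raise RuntimeError("VAULT_ENCRYPTION is enabled but VAULT_KEY is not set")
    # Keep 0.0.0.0 as the default; loopback becomes a documented recommendation.
    host = env.get("HOST", "0.0.0.0")
    return HardeningSettings(vault_encryption=vault_encryption, host=host)
```

With this shape, an unmodified `.env` yields the current behaviour, while setting `VAULT_ENCRYPTION=true` and `HOST=127.0.0.1` selects the strict local setup.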
Testing
`149 passed, 55 skipped` (the skips are the Ollama integration tests). Added unit tests covering encryption round-trip, no-cleartext-at-rest, and wrong-key
rejection. Mask → forward → unmask round-trips verified end to end.
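
For reviewers who want a quick mental model of the outbound allowlist described under "Local-only networking", the check can be sketched roughly as below. This is not the code in `src/netguard.py`: the default host names, the comma-separated format assumed for `UPSTREAM_ALLOWLIST`, and both function names are assumptions chosen for illustration.

```python
from urllib.parse import urlsplit

# Assumption: typical API hosts for the upstreams named in this PR, plus local Ollama.
DEFAULT_ALLOWED_HOSTS = frozenset(
    {"api.anthropic.com", "api.openai.com", "openrouter.ai", "localhost", "127.0.0.1"}
)


def parse_extra_hosts(raw: str) -> frozenset:
    """Parse a comma-separated UPSTREAM_ALLOWLIST value (format assumed)."""
    return frozenset(h.strip().lower() for h in raw.split(",") if h.strip())


def is_allowed(url: str, extra_hosts: frozenset = frozenset()) -> bool:
    """Allow a request only if its exact hostname is on the allowlist."""
    host = urlsplit(url).hostname  # lowercased by urlsplit; None if absent
    if host is None:
        return False
    return host in DEFAULT_ALLOWED_HOSTS | extra_hosts
```

Matching on the exact parsed hostname, rather than on a substring or prefix of the URL, is what keeps look-alike destinations such as `api.openai.com.evil.example` blocked.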