Bug: long_hex_secret redaction rule corrupts git hashes in rendered output #136

@tcconnally

Description

Severity: 🔴 Critical (data destruction in user-facing output)

The default redaction rule long_hex_secret in src/perseus/redaction.py:47 matches any 40+ character hex string:

{"name": "long_hex_secret", "pattern": r"\b[a-fA-F0-9]{40,}\b"}

This silently matches:

  • Git SHA-1 commit hashes (exactly 40 hex chars)
  • Git SHA-256 commit hashes (64 hex chars)
  • File checksums, Docker digests, Atlassian content hashes
  • Output of git log, git rev-parse, git ls-tree, git show

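Any @query (for example @query "git log"), @waypoint, or @perseus directive whose output crosses Perseus's trust boundary has every full-length commit hash replaced with [REDACTED:long_hex_secret]. Abbreviated hashes, such as those git log --oneline prints by default, are shorter than 40 characters and are not affected. The replacement is silent and has no recovery path, so forensically and operationally important data is destroyed in the rendered output.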
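The over-match can be reproduced outside Perseus with the pattern alone. This is a minimal sketch: the pattern and the placeholder text are copied from this issue, while the redact() helper is hypothetical and only stands in for whatever src/perseus/redaction.py actually does with a match.

```python
import re

# Pattern copied from the issue; redact() is a hypothetical stand-in,
# not Perseus's real API.
LONG_HEX_SECRET = re.compile(r"\b[a-fA-F0-9]{40,}\b")


def redact(text: str) -> str:
    """Replace every match with the placeholder shown in the issue."""
    return LONG_HEX_SECRET.sub("[REDACTED:long_hex_secret]", text)


# Well-known digests of the empty string, used here as sample hashes.
sha1 = "da39a3ee5e6b4b0d3255bfef95601890afd80709"  # 40 hex chars
sha256 = "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"  # 64 hex chars
abbreviated = sha1[:7]  # length of a typical abbreviated hash

print(redact(f"commit {sha1}"))                  # full SHA-1 is redacted
print(redact(f"sha256:{sha256}"))                # 64-char digest is redacted
print(redact(f"{abbreviated} initial commit"))   # short hash passes through
```

Nothing in the pattern distinguishes a credential from an identifier: length and character set are the only criteria.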

Repro

cd "$(mktemp -d)" && git init -q && echo a > a && git add . && git -c user.email=a@b -c user.name=a commit -qm a
mkdir .perseus
echo '@query "git log --format=%H"' > .perseus/context.md
perseus render
# Expected: 40-char hash
# Actual:   [REDACTED:long_hex_secret]

Suggested fix

Either remove the rule or scope it to a credential context:

{"name": "long_hex_secret",
 "pattern": r"(?i)(?:secret|token|key|password|passwd|api[_-]?key)\s*[:=]\s*[a-f0-9]{40,}"}
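A quick check of how the scoped pattern behaves, as a sketch only: the pattern is the one proposed above, and the re.sub call and placeholder are assumptions about how Perseus applies a rule.

```python
import re

# Scoped pattern copied from the suggested fix; the substitution below is an
# assumption about how a rule is applied, not Perseus's actual code.
SCOPED = re.compile(
    r"(?i)(?:secret|token|key|password|passwd|api[_-]?key)\s*[:=]\s*[a-f0-9]{40,}"
)
PLACEHOLDER = "[REDACTED:long_hex_secret]"

git_hash = "da39a3ee5e6b4b0d3255bfef95601890afd80709"

# A bare hash has no credential label in front of it, so it is left alone.
bare = SCOPED.sub(PLACEHOLDER, f"commit {git_hash}")

# The alternation is unanchored, so "key=" inside "secret_key=" matches and
# the label is consumed together with the value.
labelled = SCOPED.sub(PLACEHOLDER, f"secret_key={git_hash}")

# A quoted value does not match: the pattern expects hex right after [:=].
quoted = SCOPED.sub(PLACEHOLDER, f'token: "{git_hash}"')

print(bare)
print(labelled)
print(quoted)
```

Two things may be worth deciding before merging. The match includes part of the label, so secret_key=<hex> renders as secret_[REDACTED:long_hex_secret]; a capture group around the hex run would keep the label intact if that is preferred. Quoted values (token: "<hex>") are not matched by the pattern as written.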

Acceptance criteria

  • Regression test in tests/test_redaction.py: a bare 40-char hex git hash is NOT redacted by defaults.
  • Regression test: secret_key=<40-hex> IS redacted.
  • CHANGELOG.md note for the redaction rule change.
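The two regression tests could look roughly like the following. This is a sketch under assumptions: the real entry point in src/perseus/redaction.py is not shown in this issue, so DEFAULT_RULES and apply_default_rules() are hypothetical stand-ins that would be replaced by imports from the actual module.

```python
import re

# Hypothetical stand-in for Perseus's default rules, using the scoped pattern
# from the suggested fix. Replace with the real import in tests/test_redaction.py.
DEFAULT_RULES = [
    {"name": "long_hex_secret",
     "pattern": r"(?i)(?:secret|token|key|password|passwd|api[_-]?key)\s*[:=]\s*[a-f0-9]{40,}"},
]


def apply_default_rules(text: str) -> str:
    """Apply each rule in order, replacing matches with [REDACTED:<name>]."""
    for rule in DEFAULT_RULES:
        text = re.sub(rule["pattern"], f"[REDACTED:{rule['name']}]", text)
    return text


# Arbitrary 40-character hex string standing in for a git SHA-1.
GIT_HASH = "0123456789abcdef0123456789abcdef01234567"


def test_bare_git_hash_is_not_redacted():
    assert apply_default_rules(GIT_HASH) == GIT_HASH


def test_labelled_hex_secret_is_redacted():
    out = apply_default_rules(f"secret_key={GIT_HASH}")
    assert GIT_HASH not in out
    assert "[REDACTED:long_hex_secret]" in out
```

The second test asserts only that the hex value is gone and the placeholder is present, so it stays valid whether or not the final rule preserves the label.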
