
fix: v1.0.6 hotfix bundle — merge all 8 open PRs (#128, #129, #131, #136, #137, #139, #165, #166)#179

Merged: tcconnally merged 20 commits into main from integration/v1.0.6-pr-merge on Jun 5, 2026

@tcconnally
Owner

Supersedes all individual v1.0.6 fix PRs

This integration PR merges all 8 open fix PRs into a single branch, resolving merge conflicts that prevented individual merge (all branches were based on stale main versions).

Included fixes

| PR | Issue | Description |
|----|-------|-------------|
| #159 | #136 | long_hex_secret redaction scoped to credential context |
| #164 | #129 | Trust profile layering made structural (user values always win) |
| #161 | #128 | Auto-migrate legacy MD5 Mnēmē narrative files to SHA-256 |
| #162 | #131 | Wall-clock deadline on memory compact LLM path |
| #163 | #139 | Kill subprocess tree on _call_tool timeout; non-blocking executor shutdown |
| #160 | #137 | Redact secrets in audit log command/error fields |
| #170 | #165 | Pre-scan control-flow awareness in parallel_queries |
| #171 | #166 | Apply redaction to all MCP tool return paths |

Additional fixes

  • Fixed audit.redact_fields=false to also skip whole-record redaction (was always running)
  • Added pgrep availability check to subprocess kill test
  • CHANGELOG consolidated under single UNRELEASED section

Test results

943 passed, 2 skipped in 69.23s

Related PRs (to be closed)

Closes #128, Closes #129, Closes #131, Closes #136, Closes #137, Closes #139, Closes #165, Closes #166

tconnally-sam and others added 20 commits June 3, 2026 16:39
The pre-1.0.6 default rule `\b[a-fA-F0-9]{40,}\b` silently destroyed git
commit hashes (40 hex chars), SHA-256 sums (64 hex chars), Docker digests,
and Atlassian content hashes in any rendered output that crossed the trust
boundary. `@query "git log --oneline"` produced `[REDACTED:long_hex_secret]`
for every commit hash, with no recovery path.

The rule now requires an explicit credential anchor before matching:
  (?i)(?:secret|token|key|password|passwd|api[_-]?key|auth(?:orization)?)
  \s*[:=]\s*["']?([a-fA-F0-9]{40,})["']?

Real secrets in credential context are still caught; bare hashes pass through.

Internals:
- New `_anchor_group` field on rule dicts identifies which capture group
  holds the secret payload. _sub() in redact_text() replaces only that span
  while preserving surrounding context verbatim.
- Legacy prefix-preserve behavior (group(1) is a prefix) is retained for
  bearer_header and other existing rules — controlled by absence of
  _anchor_group.
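
A minimal sketch of the anchored rule described above. The regex is the one quoted in this message; the rule-dict layout (`name`, `pattern`) and the function `redact_anchored` are hypothetical stand-ins for illustration, not the project's actual `redact_text()` code.

```python
import re

# Rule dict: only `_anchor_group` is named in the commit message;
# the `name` and `pattern` keys are assumptions made for this sketch.
RULE = {
    "name": "long_hex_secret",
    "pattern": re.compile(
        r"(?i)(?:secret|token|key|password|passwd|api[_-]?key|auth(?:orization)?)"
        r"\s*[:=]\s*[\"']?([a-fA-F0-9]{40,})[\"']?"
    ),
    "_anchor_group": 1,  # capture group that holds the secret payload
}


def redact_anchored(text: str, rule: dict) -> str:
    """Replace only the anchored capture group, keeping the rest of the match."""
    group = rule["_anchor_group"]
    marker = f"[REDACTED:{rule['name']}]"

    def _sub(match: re.Match) -> str:
        whole = match.group(0)
        start, end = match.span(group)
        offset = match.start()  # convert absolute positions to match-relative
        return whole[: start - offset] + marker + whole[end - offset:]

    return rule["pattern"].sub(_sub, text)
```

Only the hex payload is swapped for the marker, so the key name, separator, and quotes survive, while a bare 40-character hash never matches because no credential anchor precedes it.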

Tests (tests/test_redaction.py):
- test_bare_git_sha1_is_not_redacted_by_defaults
- test_bare_sha256_checksum_is_not_redacted_by_defaults
- test_credential_anchored_hex_IS_redacted
- test_credential_anchored_hex_preserves_surrounding_context
- test_bearer_header_prefix_still_preserved
- test_at_query_git_log_output_survives_redaction

All 27 redaction tests pass. test_edge_cases.py parity vs main confirmed
(0 net new failures).

Closes #136
Refs milestone v1.0.6
#137)

Pre-1.0.6, calls like `@query "curl -H 'Authorization: Bearer ghp_…'"` produced
correctly-redacted render output BUT persisted the raw bearer token in
`~/.perseus/audit_log.jsonl` (via the `command` field), and leaked the same
secret in `@query` error/timeout/no-output messages back into render output.

Render-time redaction only applies to the final assembled output, not to
audit fields or to error strings constructed before the redaction pass runs.

Two fixes:

src/perseus/audit.py:
- audit_event() now passes every user-supplied field value through
  redact_text() before serializing to JSONL. Structural fields (directive,
  exit_code, duration_ms, pid, etc.) are exempt via an explicit allowlist
  (_AUDIT_NEVER_REDACT_KEYS) — they are never user-supplied secrets.
- New _audit_redact_value() helper walks nested dicts and lists recursively.
- New config knob audit.redact_fields (default true). Operators can set
  false for forensic mode where the audit log is itself the secured artifact.
- On redact_text failure, fall back to raw value rather than dropping the
  audit entry — observability beats perfect redaction, and rendered output
  is the primary defense.
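
The audit-side behavior above can be sketched roughly as follows. This is an illustration under assumptions: `redact_text` here is a toy stand-in that only masks bearer tokens, `NEVER_REDACT_KEYS` contains just the structural fields named in this message, and `redact_event` is a hypothetical wrapper around the `audit.redact_fields` knob.

```python
import re
from typing import Any

# Structural fields listed in the commit message; the real allowlist may be longer.
NEVER_REDACT_KEYS = frozenset({"directive", "exit_code", "duration_ms", "pid"})

_BEARER = re.compile(r"(Bearer\s+)\S+")


def redact_text(text: str) -> str:
    """Toy stand-in for the project's redactor: masks bearer tokens only."""
    return _BEARER.sub(r"\1[REDACTED]", text)


def redact_value(value: Any) -> Any:
    """Walk nested dicts and lists, redacting every string outside the allowlist."""
    if isinstance(value, str):
        try:
            return redact_text(value)
        except Exception:
            return value  # keep the audit entry rather than drop it
    if isinstance(value, dict):
        return {
            key: item if key in NEVER_REDACT_KEYS else redact_value(item)
            for key, item in value.items()
        }
    if isinstance(value, (list, tuple)):
        return [redact_value(item) for item in value]
    return value


def redact_event(event: dict, redact_fields: bool = True) -> dict:
    """Return the event as it would be serialized; forensic mode skips redaction."""
    return redact_value(event) if redact_fields else dict(event)
```

Applying the allowlist at every nesting level keeps the walk simple; whether the real helper exempts nested keys the same way is not stated here.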

src/perseus/directives/query.py:
- Exit-nonzero header: cmd + stderr now passed through redact_text before
  interpolation.
- No-output message: cmd redacted.
- TimeoutExpired branch: cmd redacted.
- Generic exception branch: str(exc) redacted (exc.args often includes
  the full cmd via shell argv).

Tests (tests/test_audit_log.py):
- test_audit_event_redacts_aws_key_in_command_field
- test_audit_event_redacts_bearer_token_in_command_field
- test_audit_event_does_not_redact_structural_fields
- test_audit_event_redact_fields_can_be_disabled (forensic opt-out)
- test_audit_event_walks_nested_dict_fields

Test results:
- All 5 new regression tests pass.
- tests/test_audit_log.py parity vs main confirmed (6 pre-existing failures
  on both branches — unrelated to this change; tracked separately).
- tests/test_redaction.py: 21/21 pass (unchanged).

Closes #137
Refs milestone v1.0.6
…128)

Pre-1.0.3, Mnēmē derived per-workspace narrative file names from an MD5 hash
of the canonicalized workspace path. v1.0.3 switched to SHA-256 without any
migration path. On upgrade, every existing narrative file on disk was
silently orphaned: `_mneme_path()` returned a path that didn't exist, Mnēmē
reported "No narrative found for this workspace", and started fresh. The
old MD5 files sat on disk untouched (preserved, but unreachable through any
documented command).

This patch makes the upgrade lossless and gives operators a manual recovery
tool for edge cases.

Changes:

src/perseus/mneme_narrative.py:
- New _workspace_hash_legacy_md5(): reproduces the pre-1.0.3 hash exactly.
  Uses hashlib.md5(canonical, usedforsecurity=False) so FIPS-mode Pythons
  don't reject it (it's a file-naming hash, not a security primitive).
  Falls back to no-kwarg call on Python < 3.9.
- _mneme_path() now performs a one-shot in-place migration: if the SHA-256
  path doesn't exist but the legacy MD5 path does, os.replace atomically
  renames it. Idempotent. If both paths exist (race or operator staging),
  SHA-256 wins and legacy file is left untouched. If the rename fails
  (cross-device, permission), both files are preserved and the caller
  creates a fresh narrative at the SHA-256 path (non-fatal).
- New _mneme_doctor_scan(): classifies every *.md in the memory store as
  sha256, legacy_md5, orphan (frontmatter workspace doesn't match
  filename), or unknown (non-hex stem). Returns a structured dict.
- New _mneme_doctor_migrate(): walks scan output and renames every
  legacy MD5 file. Returns a report of migrated/skipped/errors tuples.
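
A rough sketch of the one-shot migration and the stem-based classification described above. It assumes file names are the full hex digest plus `.md` and that canonicalization is `os.path.realpath`; neither detail is confirmed by this message. The names `legacy_md5_hash`, `narrative_path`, and `classify_stem` are hypothetical, and the `orphan` class is omitted because it needs frontmatter parsing.

```python
import hashlib
import os
import re
from pathlib import Path


def legacy_md5_hash(canonical: bytes) -> str:
    """File-naming hash only; usedforsecurity=False keeps FIPS builds happy."""
    try:
        return hashlib.md5(canonical, usedforsecurity=False).hexdigest()
    except TypeError:  # Python < 3.9 has no usedforsecurity keyword
        return hashlib.md5(canonical).hexdigest()


def narrative_path(store: Path, workspace: str) -> Path:
    """Return the SHA-256 path, renaming a legacy MD5 file into place if needed."""
    canonical = os.path.realpath(workspace).encode("utf-8")
    new_path = store / (hashlib.sha256(canonical).hexdigest() + ".md")
    old_path = store / (legacy_md5_hash(canonical) + ".md")
    if not new_path.exists() and old_path.exists():
        try:
            os.replace(old_path, new_path)  # atomic on the same filesystem
        except OSError:
            pass  # non-fatal: both files stay, caller starts fresh at new_path
    return new_path


def classify_stem(stem: str) -> str:
    """Classify a narrative file by the shape of its name alone."""
    if re.fullmatch(r"[0-9a-f]{64}", stem):
        return "sha256"
    if re.fullmatch(r"[0-9a-f]{32}", stem):
        return "legacy_md5"
    return "unknown"
```

Because the rename only happens when the SHA-256 path is missing, calling `narrative_path` a second time is a no-op, which is what makes the migration idempotent.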

src/perseus/agora.py:
- New cmd_memory_doctor handler. Plain-text or JSON output. Read-only
  scan by default; `--migrate` flag performs the renames.

src/perseus/cli.py:
- Register `perseus memory doctor` subcommand with `--migrate` and `--json`.

Tests (tests/test_mneme.py):
- test_mneme_path_auto_migrates_legacy_md5_file
- test_mneme_path_no_migration_when_sha256_already_exists
- test_mneme_path_is_idempotent_after_migration
- test_memory_doctor_scan_classifies_files (4 file types)
- test_memory_doctor_migrate_renames_legacy_files (idempotent check)
- test_memory_doctor_migrate_skips_when_destination_exists

All 6 new regression tests pass. All 19 mneme tests pass.
CLI help confirmed: `perseus memory doctor --help` works end-to-end.

Closes #128
Refs milestone v1.0.6

Pre-1.0.6, `perseus memory compact` with an LLM provider configured could
hang for hours. The root cause: _mneme_compact_llm() → run_llm() only
enforced llm.timeout_s (default 30s) on the HTTP request itself. With
streaming-token providers like Ollama serving large models, individual
tokens arrive within timeout but total wall time was unbounded.

This patch adds a true wall-clock deadline at the _memory_do_compact()
level, with deterministic fallback so operators always get a usable
narrative.

src/perseus/agora.py:
- _memory_do_compact() now submits the LLM call to a ThreadPoolExecutor and
  bounds it with future.result(timeout=total_timeout).
- New knob: memory.compact_total_timeout_s (default 180s).
- On timeout: stderr message + audit_event('memory_compact_timeout', ...)
  + fall back to _deterministic_narrative.
- On generic LLM exception (provider unreachable, payload error): same
  deterministic fallback path. memory compact never propagates LLM
  failures up to the operator.
- Executor is shutdown(wait=False, cancel_futures=True) so the call
  returns immediately on timeout. The worker thread is daemonized and
  cannot block process exit.
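
The deadline-plus-fallback pattern above could look roughly like this. It is a sketch, not the project's `_memory_do_compact()`: `compact_with_deadline`, `llm_call`, and `fallback` are hypothetical names, the stderr and audit reporting are left out, and treating `0` as "no deadline" is this sketch's reading of the escape hatch mentioned in the config notes. `cancel_futures` requires Python 3.9 or later.

```python
import concurrent.futures


def compact_with_deadline(llm_call, fallback, total_timeout_s):
    """Run llm_call under a wall-clock deadline; use fallback() on any failure."""
    if not total_timeout_s:
        # Assumed meaning of 0: deadline disabled, call the LLM directly.
        try:
            return llm_call()
        except Exception:
            return fallback()

    # No `with` block: leaving one would wait for the worker to finish.
    executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    try:
        future = executor.submit(llm_call)
        return future.result(timeout=total_timeout_s)
    except concurrent.futures.TimeoutError:
        # Deadline exceeded: the real code also reports to stderr and the audit log.
        return fallback()
    except Exception:
        # Provider unreachable, payload error, and similar failures.
        return fallback()
    finally:
        # Return immediately; the abandoned call keeps running in its thread.
        executor.shutdown(wait=False, cancel_futures=True)
```

The caller gets a result within roughly `total_timeout_s`, but the worker thread is not interrupted and still runs until its own call returns.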

src/perseus/config.py:
- Add memory.compact_total_timeout_s: 180 to DEFAULT_CONFIG with
  explanatory comment about pre-1.0.6 behavior and the 0=disabled escape
  hatch.

Limitation (documented in code + CHANGELOG): Python's ThreadPoolExecutor
cannot truly kill a running thread. The in-flight HTTP request continues
until urllib's per-request timeout fires. Worst-case wait is therefore
compact_total_timeout_s + llm.timeout_s. Daemonized so it doesn't block
exit.

Tests (tests/test_memory.py):
- test_memory_compact_total_timeout_falls_back_to_deterministic — slow
  LLM mock exceeds 0.5s deadline; assert <1.5s return + deterministic body
  + stderr message
- test_memory_compact_succeeds_within_total_timeout — fast LLM mock under
  deadline; assert LLM body present
- test_memory_compact_llm_exception_falls_back_to_deterministic —
  exception in LLM path; assert no propagation + deterministic body +
  stderr message
- test_memory_compact_default_timeout_is_180s — config default

All 4 new regression tests pass. All 47 tests in test_memory.py and
test_mneme.py pass.

Closes #131
Refs milestone v1.0.6

Pre-1.0.6 MCP _call_tool had two coupled bugs:

1. future.result(timeout=...) only abandoned the future — the worker
   thread and any subprocess it spawned kept running, leaking CPU and
   side effects (network, file writes, locks held).
2. The wrapper was a 'with concurrent.futures.ThreadPoolExecutor(...)'
   block, so executor.shutdown(wait=True) ran on exit — blocking the
   MCP response until the abandoned worker finished. A 5s timeout on
   'sleep 600' blocked the response for ~600s.

The two bugs reinforced each other: bug 2 only matters because bug 1
left a worker to wait for. Fixing either alone is insufficient.

Changes:

src/perseus/mcp.py:
- _call_tool() switches to non-context-managed executor. shutdown(
  wait=False, cancel_futures=True) runs in a finally block — response
  returns within ~timeout seconds, never blocks on the worker.
- On timeout, _call_tool reaches into directives.query via a new
  kill_active_subprocess_for_thread(tid) function. globals().get(...)
  lookup covers both the source-tree and built-artifact deployment.
- Timeout response includes ' (subprocess killed)' suffix when the
  kill succeeded, for operator observability.

src/perseus/directives/query.py:
- Swap subprocess.run() for Popen + communicate(timeout=) so we can
  expose the live popen handle to upstream timeout wrappers.
- start_new_session=True (POSIX) gives the child its own process group.
  Windows: fallback to taskkill /F /T /PID for tree kill.
- New module-level _ACTIVE_SUBPROCESSES: dict[thread_id, Popen] guarded
  by _ACTIVE_SUBPROCESSES_LOCK. _record/_clear/_kill helpers manage it.
- _kill_subprocess_tree() sends SIGTERM to PGID, waits 1s, then SIGKILL.
  Best-effort; falls back to proc.kill() on any error.
- New public function kill_active_subprocess_for_thread(thread_id)
  for the MCP wrapper.
- Subprocess registration cleared in finally so registry stays sane
  even when communicate() throws.

Tests (tests/test_mcp.py):
- test_call_tool_timeout_does_not_block_on_executor_shutdown — assert
  1s timeout on 'sleep 10' returns in <3s (was ~10s pre-fix).
- test_call_tool_timeout_actually_kills_subprocess — POSIX-only;
  uses pgrep with a unique marker to verify the sleep subprocess is
  killed within 0.5s of timeout.
- test_call_tool_normal_completion_under_timeout — sanity: under-
  timeout calls still work end-to-end.
- test_kill_active_subprocess_for_thread_returns_false_when_no_subprocess
  — the killer is safe to call when no subprocess is registered.
- New helper _mcp_query_cfg() opts in perseus_query via tool_allowlist
  (required by 1.0.5 trust gates).

All 4 new regression tests pass. test_mcp.py parity vs main confirmed
(1 pre-existing failure unchanged; unrelated to this fix).

Closes #139
Refs milestone v1.0.6
…129)

Pre-v1.0.6 the layering precedence rule 'user config wins over profile
defaults' was correct *as documented* (task-45 AC #3) but depended
entirely on load_config calling _apply_permission_profile BEFORE the
user-merge step. Any future refactor reordering these two calls — or a
third caller invoking the profile-apply directly without knowing the
convention — would silently revert a user who set both
permissions.profile=balanced AND render.allow_query_shell=true to
allow_query_shell=false. This is exactly the #129 scenario.

Fix:

src/perseus/config.py:
- _apply_permission_profile() gains a new keyword-only arg
  skip_keys: set[tuple[str, str]] | None. Keys in skip_keys are not
  overwritten by the profile, structurally guaranteeing user-wins.
- Backward compatible: callers passing no skip_keys get the legacy
  destructive merge.

src/perseus/audit.py (load_config):
- Pre-scans loaded_sources to collect (section, key) pairs the user has
  explicitly set across global + workspace configs.
- Passes that set as skip_keys to _apply_permission_profile.
- Emits a 'config_profile_overridden' audit event listing which
  user-set keys won out over the profile, so the layering decision is
  observable and debuggable.
- Audit emission is best-effort (try/except around audit_event) to
  ensure config loading never fails on an audit-write hiccup.
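
To make the structural guarantee concrete, here is a minimal sketch assuming configs are nested `{section: {key: value}}` dicts. `BALANCED` and its `example_gate` key are invented for illustration, and `user_set_keys` is a hypothetical name for the pre-scan step; only `skip_keys` and `allow_query_shell` come from this message.

```python
# Hypothetical profile defaults; the real profile contents are not shown here.
BALANCED = {"render": {"allow_query_shell": False, "example_gate": False}}


def user_set_keys(sources):
    """Collect every (section, key) pair the user set in any loaded source."""
    keys = set()
    for source in sources:
        for section, values in source.items():
            if isinstance(values, dict):
                keys.update((section, key) for key in values)
    return keys


def apply_permission_profile(cfg, profile_defaults, *, skip_keys=None):
    """Apply profile defaults in place, never touching keys listed in skip_keys.

    Returns the profile keys that were skipped, i.e. where the user value won.
    With skip_keys omitted this is the legacy destructive merge.
    """
    skip = skip_keys or set()
    user_won = []
    for section, values in profile_defaults.items():
        target = cfg.setdefault(section, {})
        for key, value in values.items():
            if (section, key) in skip:
                user_won.append((section, key))
            else:
                target[key] = value
    return user_won
```

With the skip set passed in, the outcome no longer depends on whether the profile is applied before or after the user merge, which is the ordering hazard described above. The returned list is the kind of data a `config_profile_overridden` audit event could carry.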

tests/test_permission_profiles.py (36 new tests):
- 30 parametrized tests: 3 profiles × 5 boolean security gates ×
  2 override directions. For each combination, asserts the user value
  wins on the overridden key AND non-overridden keys still reflect the
  profile.
- test_explicit_user_value_wins_when_set_to_same_value_as_profile:
  semantic equivalence still triggers an audit event.
- test_workspace_overrides_global_for_profile_and_render:
  the most-realistic real-world layering scenario.
- test_apply_permission_profile_skip_keys_directly:
  unit test for the new skip_keys parameter.
- test_apply_permission_profile_legacy_no_skip_keys_still_works:
  backward compatibility safety net.
- test_audit_log_records_profile_override_decision:
  audit event is emitted when override happens.
- test_no_audit_event_when_user_does_not_override_profile_keys:
  no audit noise for non-profile-managed keys.

All 54 tests pass (18 existing + 36 new). test_audit_log.py parity vs
main confirmed (6 pre-existing failures unchanged).

No breaking config changes. Behavior is strictly safer.

Closes #129
Refs milestone v1.0.6

Pre-v1.0.6 the renderer's parallel_queries pre-scan walked every line
ignoring @if/@else/@endif, so a @query inside a false conditional
branch still pre-executed in parallel:

    @if production
    @query 'aws s3 ls s3://prod-data'   # <-- still ran in dev!
    @endif

This is the control-flow bypass reported in #165: sensitive queries
guarded by env/dry-run/debug gates executed regardless of the
condition value. The failure was silent: no error, no warning, no audit signal.

Fix: pre-scan now tracks an @if/@else/@endif stack and evaluates each
condition exactly once via the same evaluate_condition() used by the
main render loop. The stack frames carry (active: bool, in_else_branch)
so nested @if respects ancestor activity. Queries are only enqueued
when ALL enclosing @if frames are active.

The main render loop re-evaluates conditions independently, so any
transient inconsistency in evaluation between pre-scan and main loop
only manifests as a cache miss — never as a query running when it
shouldn't, and never as a query failing to run when it should.

Malformed or unevaluable conditions are treated as false in both the
pre-scan and the main loop, matching the existing failure mode.
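
A simplified sketch of a control-flow-aware pre-scan. It assumes one directive per line and takes the condition evaluator as a parameter; `prescan_queries` and the two-element frame layout are illustrative choices, not the renderer's actual data structures.

```python
def prescan_queries(lines, evaluate_condition):
    """Return the @query commands that sit in active branches only."""
    stack = []     # one [branch_active, else_would_be_active] frame per open @if
    enqueued = []
    for raw in lines:
        line = raw.strip()
        if line == "@if" or line.startswith("@if "):
            try:
                truth = bool(evaluate_condition(line[3:].strip()))
                frame = [truth, not truth]
            except Exception:
                frame = [False, False]  # malformed: neither branch is active
            stack.append(frame)
        elif line == "@else":
            if stack:
                stack[-1] = [stack[-1][1], False]  # switch to the else branch
        elif line == "@endif":
            if stack:
                stack.pop()
        elif line.startswith("@query ") and all(frame[0] for frame in stack):
            # Enqueue only when every enclosing @if frame is active.
            enqueued.append(line[len("@query "):].strip())
    return enqueued
```

Each condition is evaluated once, when its `@if` line is seen, and the `all(...)` check is what makes a nested `@if` respect an inactive ancestor.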

Tests (tests/test_bugfix_165_parallel_queries_control_flow.py — 8):
- false-branch @query does NOT run with parallel_queries=True
- true-branch @query DOES run with parallel_queries=True
- else-branch @query runs when @if is false
- if-branch @query skipped when else taken
- nested inactive-outer means inner @query does NOT run
- nested active-outer/active-inner means @query runs
- behavior parity with parallel_queries=False
- malformed @if skips both branches in pre-scan

All 8 new tests pass. Net renderer-suite delta vs main: -1 failure
(one previously-failing test now passes; the rest are pre-existing
prefetch/schema flakes unrelated to this fix).

Discovered by Codex code review (2026-06-03).

Closes #165
Refs milestone v1.0.6

Pre-v1.0.6:
- perseus_get_context called render_source (no redaction) instead of
  render_output (which does apply redaction).
- All other tool resolvers (perseus_read, perseus_query, etc.) returned
  raw resolver output via _call_resolver, never passing through the
  redaction pipeline.

Result: secrets configured in redaction.patterns leaked through MCP to
the connected client (Claude Desktop, Rovo Dev, etc.) even when
redaction.enabled: true was set in config. Discovered by Codex code
review (2026-06-03).

Fix:

src/perseus/mcp.py:
- New helper _mcp_redact(result, cfg) honors redaction.enabled, type-
  guards non-string inputs, and swallows redactor exceptions defensively.
  Uses globals() lookup for build-artifact compatibility, falls back to
  explicit import in source mode.
- _call_tool wraps every successful return path:
  - perseus_get_context: redact BEFORE serialization so JSON payloads
    carry already-redacted text.
  - perseus_get_health: redact resolver output.
  - Generic directive dispatch: redact result before return.
  - Exception path: redact the error string (resolver messages can echo
    user content).
- Error strings constructed locally (e.g. 'Error: tool X not allowed')
  bypass redaction since they never echo user content.
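
The helper's three guards can be sketched as below. This is an assumption-laden illustration: the redactor is passed in explicitly instead of being found through `globals()`, a missing `redaction.enabled` key is treated as enabled, a failing redactor returns the input unchanged, and `call_tool` is a hypothetical stand-in for the wrapped return paths.

```python
def mcp_redact(result, cfg, redactor):
    """Redact a tool result if redaction is enabled and the result is a string."""
    if not cfg.get("redaction", {}).get("enabled", True):
        return result  # redaction switched off in config
    if not isinstance(result, str):
        return result  # type guard: only strings are redacted
    try:
        return redactor(result)
    except Exception:
        return result  # assumed behavior: a broken redactor must not break the tool


def call_tool(resolver, cfg, redactor):
    """Route both the success path and the error path through mcp_redact."""
    try:
        return mcp_redact(resolver(), cfg, redactor)
    except Exception as exc:
        # Resolver messages can echo user content, so the error is redacted too.
        return mcp_redact(f"Error: {exc}", cfg, redactor)
```

Funnelling every return through one helper means a new tool resolver cannot skip redaction by accident, which is the failure mode described at the top of this message.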

Tests (tests/test_bugfix_166_mcp_redaction.py — 10):
- test_perseus_get_context_redacts_secret (markdown format)
- test_perseus_get_context_json_format_redacts (JSON format)
- test_perseus_get_context_preserves_secret_when_redaction_disabled (sanity)
- test_perseus_query_result_redacts_secret (stdout redaction)
- test_perseus_read_result_redacts_secret (file content redaction)
- test_call_tool_exception_path_redacts (error path)
- test_perseus_get_health_redacts (legacy resolver shortcut)
- test_mcp_redact_returns_unchanged_when_disabled
- test_mcp_redact_returns_non_str_unchanged
- test_mcp_redact_swallows_redactor_exceptions

All 10 new tests pass.

Closes #166
Refs milestone v1.0.6

The original merge auto-merged mcp.py but silently dropped the
executor.shutdown(wait=False) + subprocess-kill changes because
the PR branch was based on a stale main. Applied the fix directly
to the source module and rebuilt perseus.py.

Also adds a pgrep availability check to test_mcp.py.
This was referenced Jun 5, 2026
@tcconnally tcconnally merged commit 4fd6105 into main Jun 5, 2026
3 checks passed