Problem
Harden has anti-patterns that describe audit failures, but not patterns that detect agent self-deception during execution. Agents rationalize skipping steps with plausible-sounding excuses that slip past current checks.
Examples of agent rationalization:
- "I already tested it manually" → skipped automated verification
- "This is straightforward enough to not need tests" → skipped TDD
- "The output looks reasonable" → no oracle verification
- "I'll come back to this later" → deferred and forgotten
Proposed Solution
Add explicit anti-rationalization patterns to harden's audit that catch:
- Claims of completed work without evidence (logs, test output, screenshots)
- Substitution of easier tasks for harder required tasks
- Deferral language that indicates skipped steps
- "Looks good/reasonable" assessments without comparison to an independent source
Likely Affected Files/Modules
SKILL.md — new patterns in anti-pattern section or as a detection dimension within existing passes
Acceptance Criteria
Problem
Harden has anti-patterns that describe audit failures, but not patterns that detect agent self-deception during execution. Agents rationalize skipping steps with plausible-sounding excuses that slip past current checks.
Examples of agent rationalization:
Proposed Solution
Add explicit anti-rationalization patterns to harden's audit that catch:
Likely Affected Files/Modules
SKILL.md— new patterns in anti-pattern section or as a detection dimension within existing passesAcceptance Criteria