Spun out from PR #70 (Phase 3: detection rules first-class). This issue tracks Phase 4 performance benchmark findings #1 and #2.
Finding #1 — sync=True is the bulk-ingest bottleneck
Both the Sigma and YARA ingest paths currently call `mm.remember(..., sync=True)`, so each note is persisted, vector-indexed, and enrichment-flushed inline before control returns to the caller. Under bulk ingest (SigmaHQ ~3k rules, CCCS-Yara ~400 rules) this is the dominant cost.
Phase 4 bench numbers (paste from the perf report when wiring this up):
- Sigma: `ingest_rules_dir` on 4 fixtures — ~4s wall, ~95% in `remember(sync=True)`.
- YARA: same pattern — parse is microseconds, persistence is seconds.
Finding #2 — YARA p95 plyara tail
`plyara.Plyara().parse_string` has a fat tail under repeated invocations on large rule files. p50 is fine; p95 can exceed p50 by 10x on ~50 kB multi-rule files (observed on the 3 CCCS-Yara fixtures under a tight loop). A reproduction sketch is below.
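A minimal sketch of that tight loop, assuming the CCCS-Yara fixtures live under a local directory; the path, glob pattern, and iteration count are illustrative, not the actual Phase 4 harness:

```python
import statistics
import time
from pathlib import Path

import plyara

FIXTURES = Path("tests/fixtures/yara")  # assumed fixture location


def p50_p95(source: str, iterations: int = 200) -> tuple[float, float]:
    """Parse one rule file repeatedly, with a fresh parser per call (as today)."""
    samples = []
    for _ in range(iterations):
        start = time.perf_counter()
        plyara.Plyara().parse_string(source)  # fresh instance every call
        samples.append(time.perf_counter() - start)
    samples.sort()
    return statistics.median(samples), samples[int(len(samples) * 0.95)]


for path in sorted(FIXTURES.glob("*.yar")):
    p50, p95 = p50_p95(path.read_text())
    print(f"{path.name}: p50={p50 * 1e3:.2f}ms p95={p95 * 1e3:.2f}ms")
```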
Ask
Benchmark `mm.remember(..., sync=False)` plus an explicit `mm.flush()` at the end of `ingest_rules_dir`, and/or introduce a `bulk=True` path on `MemoryManager` that defers the vector-index write. Add a CI bench (pytest bench plugin is OK) that fails if p95 exceeds a threshold on the fixtures tree. A sketch of both is below.
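A minimal sketch of both pieces, assuming `remember()` grows a `sync=False` mode and `MemoryManager` a `flush()` as proposed above; `parse_rule`, the `make_mm` and `fixtures_dir` fixtures, and the 2.0 s threshold are placeholders, not existing names:

```python
import time


def ingest_rules_dir(mm, rules_dir):
    for path in sorted(rules_dir.rglob("*.yml")):
        note = parse_rule(path)        # existing parse step, name assumed
        mm.remember(note, sync=False)  # defer persist + vector index
    mm.flush()                         # one batched write at the end


def _timed_ingest(mm, rules_dir):
    start = time.perf_counter()
    ingest_rules_dir(mm, rules_dir)
    return time.perf_counter() - start


def test_ingest_p95(make_mm, fixtures_dir):
    # Plain-pytest threshold check; pytest-benchmark with
    # --benchmark-compare-fail would do the same job in CI.
    samples = sorted(_timed_ingest(make_mm(), fixtures_dir) for _ in range(10))
    p95 = samples[int(len(samples) * 0.95)]
    assert p95 < 2.0, f"ingest p95 regressed: {p95:.2f}s"  # placeholder threshold
```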
For plyara: cache the `Plyara()` instance per directory walk (it is currently created per call, so we pay the grammar-compile cost ~400x on CCCS-Yara). Confirm thread-safety before sharing one instance across workers (sketch below).
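A sketch of the per-walk cache. One caveat: a `Plyara` instance accumulates parsed state across calls, so reusing it means calling `clear()` between files; keeping one instance per worker also sidesteps the thread-safety question:

```python
import plyara


def parse_yara_dir(rules_dir):
    parser = plyara.Plyara()  # built once per walk, not once per file
    for path in sorted(rules_dir.rglob("*.yar")):
        yield path, parser.parse_string(path.read_text())
        parser.clear()  # reset accumulated state before the next file
```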
Deliberately NOT in PR #70
Changing the sync/async boundary risks regressions in existing ingest paths (OpenCTI sync, enrichment worker). It wants its own PR with real before/after numbers and a regression bench.