Tighten CCCS metadata regexes (SEC-6 / SEC-7) #73

@rolandpg

Description

Spun out from PR #70 (Phase 3: detection rules first-class), Phase 4
security review items SEC-6 and SEC-7.

Finding

The CCCS metadata validator at
src/zettelforge/yara/cccs_metadata.py derives allowed patterns from
the vendored CCCS_YARA_values.yml. Two issues:

  • SEC-6: Some field regexes are anchored so loosely that a value
    like TLP:WHITE\nid: hostile_id matches the sharing pattern,
    because the regex is not bracketed with ^...$. The validator
    therefore accepts multi-line values that would never survive the
    real upstream rule grammar.
  • SEC-7: The author regex permits the full set of RFC 5322 local
    parts, including quoted strings; operators have observed attacker
    input embedded in these fields. Tighten to [A-Za-z0-9_.@+-]+ unless
    a stricter standard is cited upstream.
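
A minimal sketch of both findings, assuming the validator applies its patterns with something like re.match. SHARING_PATTERN is a hypothetical stand-in for whatever is actually derived from CCCS_YARA_values.yml, and loose_match / strict_match are illustrative names, not functions in cccs_metadata.py. Note that in Python `$` also matches just before a trailing newline, so the sketch uses re.fullmatch rather than relying on ^...$ alone.

```python
import re

# Hypothetical stand-in for the derived sharing pattern (not the real one).
SHARING_PATTERN = r"TLP:(WHITE|GREEN|AMBER|RED)"
# Character class proposed under SEC-7.
AUTHOR_PATTERN = r"[A-Za-z0-9_.@+-]+"

# The hostile value from SEC-6: a valid prefix followed by an injected line.
hostile = "TLP:WHITE\nid: hostile_id"


def loose_match(pattern: str, value: str) -> bool:
    # re.match anchors only at the start, so anything after a valid
    # prefix is silently ignored.
    return re.match(pattern, value) is not None


def strict_match(pattern: str, value: str) -> bool:
    # re.fullmatch requires the entire string to match, including any
    # trailing newline.
    return re.fullmatch(pattern, value) is not None


print(loose_match(SHARING_PATTERN, hostile))   # prefix match is accepted
print(strict_match(SHARING_PATTERN, hostile))  # whole-string match is rejected
```

Using fullmatch (or \A...\Z) sidesteps the trailing-newline behaviour of `$`, which would otherwise still let "TLP:WHITE\n" through an ^...$ pattern.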

Ask

  • Audit every regex derived from _allowed_regexes — anchor with
    ^...$ and normalise whitespace handling.
  • Decide per-field policy: strict-ASCII vs. Unicode-word-chars.
  • Add targeted tests with each malicious input pattern as a negative
    assertion.
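
One way the three asks could fit together, as a sketch under stated assumptions: DERIVED and ASCII_ONLY are hypothetical tables (the description pattern in particular is invented for illustration), and is_valid is not the validator's real entry point. The per-field policy is expressed through the re.ASCII flag, which restricts \w to ASCII; whitespace is handled here by rejecting rather than stripping, which is itself a policy choice to confirm.

```python
import re

# Hypothetical stand-in for the derived patterns; not the real table.
DERIVED = {
    "sharing": r"TLP:(WHITE|GREEN|AMBER|RED)",
    "author": r"[A-Za-z0-9_.@+-]+",
    "description": r"[\w .,:()-]+",
}

# Hypothetical per-field policy: True = strict ASCII, False = Unicode word chars.
ASCII_ONLY = {"sharing": True, "author": True, "description": False}

COMPILED = {
    field: re.compile(pattern, re.ASCII if ASCII_ONLY[field] else 0)
    for field, pattern in DERIVED.items()
}


def is_valid(field: str, value: str) -> bool:
    # fullmatch anchors both ends, so no pattern needs its own ^...$.
    return COMPILED[field].fullmatch(value) is not None


# Each malicious input pattern becomes one negative assertion.
NEGATIVE_CASES = [
    ("sharing", "TLP:WHITE\nid: hostile_id"),     # injected second line
    ("sharing", "TLP:WHITE\n"),                   # trailing newline
    ("sharing", " TLP:WHITE"),                    # leading whitespace
    ("author", '"quoted local"@example.com'),     # RFC 5322 quoted string
    ("author", "analyst@example.com\n"),          # trailing newline
    ("description", "ok\nid: hostile_id"),        # newline in free text
]


def accepted_negatives():
    # Returns the negative cases that were wrongly accepted.
    return [(f, v) for f, v in NEGATIVE_CASES if is_valid(f, v)]


assert accepted_negatives() == []
```

In a real test suite the NEGATIVE_CASES list would map naturally onto a parametrized test, one case per row, so a regression names the exact input that slipped through.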

Deliberately NOT in PR #70

Tightening the validator risks rejecting benign real-world CCCS rules
that have been silently accepted to date. This work wants its own PR,
with a pass through the CCCS-Yara corpus to confirm that the strictness
bump doesn't regress downstream ingest.
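
The corpus pass could be sketched roughly as below. Everything here is an assumption for illustration: the line-based META_LINE extraction is a naive approximation rather than a YARA parser, the *.yar glob and function names are hypothetical, and "loose" versus "strict" simply contrasts re.match with re.fullmatch on the same pattern.

```python
import re
from pathlib import Path

# Naive approximation of a meta line: key = "value". Not a real YARA parser.
META_LINE = re.compile(r'^\s*(\w+)\s*=\s*"(.*)"\s*$')


def iter_meta(text):
    # Yield (field, value) pairs for lines that look like meta assignments.
    for line in text.splitlines():
        m = META_LINE.match(line)
        if m:
            yield m.group(1), m.group(2)


def regressions(rule_texts, patterns):
    # Collect values the loose check accepts but the strict check rejects.
    out = []
    for text in rule_texts:
        for field, value in iter_meta(text):
            pattern = patterns.get(field)
            if pattern is None:
                continue
            loose = re.match(pattern, value) is not None
            strict = re.fullmatch(pattern, value) is not None
            if loose and not strict:
                out.append((field, value))
    return out


def load_rules(root):
    # Read every *.yar file under root (hypothetical corpus layout).
    for path in Path(root).rglob("*.yar"):
        yield path.read_text(encoding="utf-8", errors="replace")
```

Running regressions(load_rules(corpus_dir), patterns) against a checkout of the corpus would list the benign values that the stricter matching starts rejecting, which is the evidence needed before the bump lands.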
