feat(rule_format): add Agent Threat Rules (ATR) format adapter#50

Draft
eeee2345 wants to merge 1 commit into rulezet:main from eeee2345:feat/atr-format

Implements the RuleType abstract contract for ATR (Agent Threat Rules), per discussion with @ecrou-exact in #49.

Files

  • app/features/rule/rule_format/available_format/atr_format.py (405 lines) — ATRRule class
  • tests/rules/test_atr_format.py (353 lines) — 24 unit tests, all passing

Diff: 758 insertions, 0 modifications to existing files.

Method coverage

All required RuleType abstract members, plus the detect() helper:

| Method | Status |
| --- | --- |
| format property | "atr" |
| get_class() | "ATRRule" |
| validate(content) | Two-layer: YAML parse + ATR semantic (canonical id pattern, 10-category enum, 4-severity enum, agent_source.type enum, detection block shape) |
| parse_metadata(content, info) | rulezet-canonical shape; preserves ATR-YYYY-NNNNN in original_uuid |
| get_rule_files(file) | .yaml / .yml extension gate |
| extract_rules_from_file(filepath) | Single-rule (verbatim, no re-dump), list-of-rules, multi-document YAML |
| find_rule_in_repo(repo_dir, rule_id) | Match by stable original_uuid or title |
| detect(content) | Disambiguation helper for the format loader; triggers on ATR-specific markers |

The detect() method is implemented as a regular instance method (not @abstractmethod), so the existing RuleType contract in rule_type_abstract.py is unchanged. If/when the abstract grows a detect() method per the discussion in #49, this implementation already satisfies it.

Style alignment with sigma_format.py

  • Section header comment blocks (# VALIDATION #, # META PARSING #, etc.)
  • Module imports in the same order (stdlib → third-party → app)
  • Error-path return shape from parse_metadata mirrors sigma's "Metadata Error" fallback
  • detect_cve usage matches sigma's contract: cve_id field is a JSON-encoded string, not a list, so downstream consumers don't need to branch on format
  • Single-rule files returned verbatim in extract_rules_from_file (no yaml.safe_dump) — preserves quoting / key-ordering like sigma does

ATR semantic validation

Beyond YAML parsing, validate() enforces:

  • id matches ^ATR-\d{4}-\d{5}$
  • severity ∈ {critical, high, medium, low}
  • tags.category ∈ the canonical 10-category ATR taxonomy
  • agent_source.type recognised (additive — unknown types are warnings, not errors)
  • detection.conditions is a non-empty array, each entry has field / operator / value
  • detection.condition ∈ {any, all, or, and}

Output: ValidationResult(ok, errors, warnings, normalized_content). normalized_content always returns the original input verbatim — never re-dumped.
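
The semantic checks listed above could look roughly like the sketch below. It is not the code in atr_format.py: it works on an already-parsed mapping (so the YAML layer is left out), the `ValidationResult` dataclass is a stand-in built from the four field names given above, and the category and agent_source.type sets are passed in as parameters because their members are not listed in this description. It also assumes an absent agent_source.type is simply not checked.

```python
import re
from dataclasses import dataclass, field

ATR_ID_RE = re.compile(r"^ATR-\d{4}-\d{5}$")
SEVERITIES = {"critical", "high", "medium", "low"}
CONDITION_EXPRS = {"any", "all", "or", "and"}
REQUIRED_CONDITION_KEYS = {"field", "operator", "value"}


@dataclass
class ValidationResult:
    # Stand-in type; field names follow the description, types are assumed.
    ok: bool
    errors: list = field(default_factory=list)
    warnings: list = field(default_factory=list)
    normalized_content: str = ""


def validate_semantics(rule, raw, categories, source_types):
    """Semantic layer only: `rule` is the parsed mapping, `raw` the input text."""
    if not isinstance(rule, dict):
        return ValidationResult(False, ["rule must be a mapping"], [], raw)
    errors, warnings = [], []

    if not ATR_ID_RE.fullmatch(str(rule.get("id", ""))):
        errors.append("id must match ATR-YYYY-NNNNN")
    if str(rule.get("severity")) not in SEVERITIES:
        errors.append("unknown severity")

    tags = rule.get("tags")
    category = tags.get("category") if isinstance(tags, dict) else None
    if str(category) not in categories:
        errors.append("unknown tags.category")

    # Additive enum: an unrecognised type is a warning, not an error.
    source = rule.get("agent_source")
    source_type = source.get("type") if isinstance(source, dict) else None
    if source_type is not None and str(source_type) not in source_types:
        warnings.append(f"unrecognised agent_source.type: {source_type}")

    detection = rule.get("detection")
    if not isinstance(detection, dict):
        errors.append("missing detection block")
    else:
        conditions = detection.get("conditions")
        if not isinstance(conditions, list) or not conditions:
            errors.append("detection.conditions must be a non-empty array")
        else:
            for i, entry in enumerate(conditions):
                if not (isinstance(entry, dict)
                        and REQUIRED_CONDITION_KEYS.issubset(entry)):
                    errors.append(
                        f"detection.conditions[{i}] needs field/operator/value")
        if str(detection.get("condition")) not in CONDITION_EXPRS:
            errors.append("detection.condition must be any/all/or/and")

    # normalized_content is the untouched input, never a re-dump.
    return ValidationResult(not errors, errors, warnings, raw)
```

Returning `raw` unchanged in every branch is what keeps the "never re-dumped" guarantee independent of whether validation succeeds.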

Tests

$ pytest tests/rules/test_atr_format.py -v
============================== 24 passed in 0.39s ==============================

Covers: format / class identifier, detect() (3 positive + 3 negative cases), validate() (canonical rule accepted + 7 distinct rejection paths), parse_metadata() (explicit CVE list, description fallback, tags flattening, error path), get_rule_files(), extract_rules_from_file() (single-rule verbatim, non-ATR-YAML returns empty, multi-doc).

Tests run inside the existing rulezet test environment (full requirements.txt + Flask app conftest). No additional dependencies introduced.

Auto-registration

Per load_all_rule_formats() in rule_type_abstract.py, dropping the adapter into available_format/ is sufficient — no manual registry edit needed.

On the future rulezet-cast migration

@ecrou-exact mentioned in #49 that the format system is being lifted out to a separate repo (rulezet/rulezet-cast). This adapter is structured to migrate cleanly: zero modifications to existing rulezet-core files, full self-containment of the format logic, and no implicit dependencies on rulezet's internal data model beyond the documented RuleType contract.

Upstream

  • ATR repo: https://github.com/Agent-Threat-Rule/agent-threat-rules (MIT, v2.1.2 / 338 rules across 10 attack categories)
  • MISP integration: taxonomies#323 + galaxy#1207, both merged 2026-05-10 by @adulau as MISP project lead
  • Production deployments: Cisco AI Defense skill-scanner, Microsoft Agent Governance Toolkit (with weekly auto-sync)

Closes #49.

Implements the RuleType abstract contract for the ATR detection-rule
standard, per discussion with @ecrou-exact in rulezet#49.

## What this adds

`app/features/rule/rule_format/available_format/atr_format.py` (361 LOC)
implements the ATRRule class with:

- `format = "atr"` and `get_class() = "ATRRule"` (RuleType properties)
- `validate(content)` — two-layer validation: syntactic (YAML parses to a
  mapping) + semantic (required fields, ATR-YYYY-NNNNN id pattern,
  ten-category taxonomy, severity enum, agent_source.type enum, detection
  block shape including condition expression + array contents).
- `parse_metadata(content, info)` — emits the rulezet-canonical shape
  (title / description / format / version / source / author /
  original_uuid / to_string / cve_id / tags / severity / license).
  Preserves the ATR rule id in `original_uuid`. Uses explicit
  `references.cve` list when present, falls back to `detect_cve` on the
  description otherwise (same shape contract as sigma_format).
- `get_rule_files(file)` — `.yaml` / `.yml` extension gate.
- `extract_rules_from_file(filepath)` — handles single-rule, list-of-rules,
  and multi-document YAML. Single-rule files are returned VERBATIM so
  quoting and key-ordering are preserved (mirrors sigma's `to_string`
  discipline).
- `find_rule_in_repo(repo_dir, rule_id)` — locates a previously-imported
  rule by stable `original_uuid` or title.
- `detect(content)` — bonus disambiguation helper for the format loader:
  triggers on ATR-specific markers (canonical id pattern, top-level
  `agent_source` field, or `tags.category` in the ATR taxonomy).
  Implemented as a regular instance method (not `@abstractmethod`) so the
  existing `RuleType` contract is unchanged; if/when the abstract grows
  a `detect()` method per the discussion in rulezet#49, this implementation
  will already satisfy it.
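
To illustrate the three `detect()` markers named above, here is a minimal sketch that checks them on an already-parsed document. The function name `looks_like_atr` and the `categories` parameter are hypothetical; the real method takes raw content and parses it first, and the actual taxonomy members are not reproduced here.

```python
import re

ATR_ID_RE = re.compile(r"ATR-\d{4}-\d{5}")


def looks_like_atr(doc, categories):
    """Return True if the parsed document carries any ATR-specific marker."""
    if not isinstance(doc, dict):
        return False  # non-mapping YAML (lists, scalars) is never ATR
    rule_id = doc.get("id")
    if isinstance(rule_id, str) and ATR_ID_RE.fullmatch(rule_id):
        return True  # marker 1: canonical id pattern
    if "agent_source" in doc:
        return True  # marker 2: top-level agent_source field
    tags = doc.get("tags")
    # marker 3: tags.category inside the ATR taxonomy
    return isinstance(tags, dict) and str(tags.get("category")) in categories
```

A Sigma-style rule, whose `tags` is a list and whose `id` is a UUID, hits none of the three branches and returns False.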

`tests/rules/test_atr_format.py` (24 tests, all passing) covers:

- format / class identifier
- detect() — positive cases (canonical rule, agent_source-only,
  category-only) and negative cases (Sigma look-alike, non-YAML,
  non-mapping)
- validate() — accepts canonical rule; rejects bad id, unknown category,
  unknown severity, missing detection block, empty conditions array,
  empty input, YAML parse error
- parse_metadata() — explicit CVE list, description fallback, tags
  flattening, error-path shape
- get_rule_files() — accepts .yaml / .yml; rejects .txt / .json
- extract_rules_from_file() — single-rule preserves verbatim, non-ATR
  YAML returns empty, multi-doc returns one entry per mapping

## Style alignment with sigma_format

Section headers, comment block format, error-path return shape, and
the no-re-dump-on-single-rule discipline all mirror the existing
sigma_format adapter. Module imports follow the same order
(stdlib → third-party → app). detect_cve usage matches sigma's
contract (JSON-encoded string output for `cve_id`).
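
As a rough illustration of the `cve_id` contract, the sketch below prefers an explicit `references.cve` list and otherwise scans the description. The regex is only a stand-in for rulezet's `detect_cve` helper, whose real behaviour is not shown in this PR, and the function name `resolve_cve_id` is hypothetical. Encoding the result as a JSON list is likewise an assumption about the exact string shape.

```python
import json
import re

# Stand-in for rulezet's detect_cve; the real helper may match differently.
CVE_RE = re.compile(r"CVE-\d{4}-\d{4,}", re.IGNORECASE)


def resolve_cve_id(rule):
    """Return the CVE ids for a parsed rule as a JSON-encoded string."""
    refs = rule.get("references")
    explicit = refs.get("cve") if isinstance(refs, dict) else None
    if isinstance(explicit, list) and explicit:
        cves = [str(c) for c in explicit]  # explicit list wins
    else:
        found = CVE_RE.findall(str(rule.get("description", "")))
        cves = sorted({c.upper() for c in found})  # de-duplicated fallback
    return json.dumps(cves)  # always a string, never a Python list
```

Because the return value is always a string, a consumer can call `json.loads` on `cve_id` without checking which format adapter produced it.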

## On the abstract interface

The PR does not modify `rule_type_abstract.py`. The `detect()` method
@ecrou-exact mentioned as a likely future addition is implemented here
as a regular instance method, so it is available for the format
loader to call when present (e.g. via duck-typing or
`callable(getattr(rt, "detect", None))`), without forcing every other
existing format to grow a stub. When the abstract eventually adds it,
this implementation will satisfy `@abstractmethod` as-is.
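
A loader could use the duck-typed check mentioned above along the following lines. This is only a sketch: `pick_format` and the two demo classes are hypothetical and do not reflect how rulezet's loader is actually written.

```python
def pick_format(adapters, content):
    """Return the format of the first adapter whose detect() claims the content."""
    for adapter in adapters:
        detect = getattr(adapter, "detect", None)
        # Adapters without detect() are skipped rather than forced to add a stub.
        if callable(detect) and detect(content):
            return adapter.format
    return None


class DemoWithDetect:
    format = "atr"

    def detect(self, content):
        return "agent_source" in content


class DemoWithoutDetect:
    format = "legacy"


chosen = pick_format([DemoWithoutDetect(), DemoWithDetect()], "agent_source: {}")
```

Here `chosen` is `"atr"`: the adapter without `detect()` is passed over silently, so existing formats need no changes.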

## Auto-registration

Per `load_all_rule_formats()` in `rule_type_abstract.py`, dropping the
adapter into `available_format/` is sufficient — no manual registry
edit needed.

## Upstream pointer

Agent Threat Rules upstream: https://github.com/Agent-Threat-Rule/agent-threat-rules
(MIT license, currently v2.1.2 / 338 rules across ten attack
categories). The format is referenced by the MISP taxonomy and galaxy
threat-intel sharing layers (MISP/misp-taxonomies#323 +
MISP/misp-galaxy#1207, both merged 2026-05-10 by Alexandre Dulaunoy /
@adulau as the MISP project lead) and is shipped in production by
Cisco AI Defense and Microsoft Agent Governance Toolkit.

Closes / addresses rulezet#49.