Honest review of what the pipeline does NOT yet do. Every gap is evidence-backed against
`main` as of this PR. Severity reflects operational impact, not implementation difficulty.
Read this with roadmap.md — every gap below has at
least one proposed feature in the roadmap that closes it.
- Critical — pipeline cannot do something it claims to do, or a failure mode is unbounded.
- High — workaround exists but is slow / risky / hard to do under pressure (e.g. during incident response).
- Medium — operational papercut. Forces analyst toil; doesn't block the workflow.
- Low — a clear improvement but not load-bearing today.
- S — under 1 day for a competent engineer who knows the codebase.
- M — 1–3 days.
- L — more than 3 days; usually because of an external API dependency, design ambiguity, or test fixture lift.
- G6: No emergency rollback command. Resolved. `contentops rollback <sha>` now materialises `detections/` at SHA via `git ls-tree` + `git show`, validates + applies against the temp tree, and writes audit records with `message="rollback to <sha>"`. Defaults dry-run; honours locks; non-destructive. See the row in the full table.
- G2: 46 `defender_custom_detection` report `changed` in drift, uncovered. Resolved (twice: drift + apply-verify). Root cause was a server-field stripping divergence between v1 collect and v2 handler (`detectorId`, `lastRunDetails`, nested timestamps). The drift symptom was fixed by bringing `DefenderCustomDetectionHandler.to_envelope` in line with `contentops.defender.collect`. A separate MISMATCH symptom (every Defender rule reported `verified=False` post-apply) was traced to the same divergence on the apply-verify side: post-PUT GET returned `schedule.nextRunDateTime`, which the local body never carried. Fixed by extracting `_strip_server_fields` and applying it symmetrically at both `to_envelope` (collect) AND apply-verify time. The `contentops defender-roundtrip-diff` diagnostic was shipped during the investigation and pinned the root cause on first run; the regression is now covered by `tests/v2/test_apply_verify_defender.py::test_apply_verifies_when_remote_has_server_managed_nested_fields`.
- G1: KQL lint is regex-based, not a real parser. Partially resolved. F1's Python policy rules ship under `--strict` (KQL101 in `contentops/lint/strict_rules.py` catches `| take` / `| limit`). The "undefined column / wrong table reference" case still relies on apply-time ARM 400 unless the optional `tools/kql_strict.dll` (Kusto.Language) wrapper is installed. Severity downgraded from High to Medium because the load-bearing case is now closed at lint time on every PR.
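
The symmetric stripping described for G2 can be sketched as follows. This is a minimal illustration, not the code in `contentops/handlers/defender_custom_detection.py`: the field sets contain only the names cited above, and the helper names `strip_server_fields` and `bodies_match` are hypothetical stand-ins.

```python
from copy import deepcopy
from typing import Any

# Assumption: only the fields named in this document are listed here;
# the real _SERVER_FIELDS set may contain more entries.
SERVER_FIELDS = {"detectorId", "lastRunDetails"}
SERVER_NESTED_FIELDS = {
    "queryCondition": {"lastModifiedDateTime"},
    "schedule": {"nextRunDateTime"},
}


def strip_server_fields(body: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of body without server-managed fields (input is not mutated)."""
    cleaned = {k: deepcopy(v) for k, v in body.items() if k not in SERVER_FIELDS}
    for parent, children in SERVER_NESTED_FIELDS.items():
        nested = cleaned.get(parent)
        if isinstance(nested, dict):
            for child in children:
                nested.pop(child, None)
    return cleaned


def bodies_match(local_body: dict[str, Any], remote_body: dict[str, Any]) -> bool:
    """Compare both sides after the same projection, as collect and apply-verify must."""
    return strip_server_fields(local_body) == strip_server_fields(remote_body)
```

Running the same projection on both sides is the point: a field the server adds after a PUT can then never make a correctly applied rule look unverified.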
As of 2026-05-22, G17 and G21 have also been resolved (PR #239), and G25/G26/G27 (operational gaps from PR #237's NVISO borrowings) were captured + closed in the same sprint. As of 2026-05-25, G28/G29/G30 (alert-performance tracking, per-detection health recommendations, and unified multi-audience reporting) have been resolved by the F21/F22/F23 alert-tracking features. The only currently-open engineering gap is G5 (Defender Graph extensions -- blocked on Microsoft GA), plus the G24 content-authoring backlog (human writing work, not engineering).
The remaining gaps below are real but lower-priority.
| Gap | Severity | Evidence | Workaround today | Effort |
|---|---|---|---|---|
| G1 KQL static lint is regex-based, not parser-based: RESOLVED | High → resolved | Closed in two layers + schema loading: (1) Python policy rules under `--strict` (KQL101 in `contentops/lint/strict_rules.py` catches `\| take` / `\| limit`); (2) Kusto.Language parser diagnostics via the C# wrapper at `tools/kql_strict/`, invoked from `contentops/lint/strict.py`; (3) F1.1 schema loading: wrapper now reads `tools/kql_strict/schemas.json` at startup and builds a `Kusto.Language.GlobalState.WithDatabase(...)` with the committed Sentinel + Defender XDR table surface so KS204 / KS142 reflect real schema-bound issues. The schemas baseline is refreshed nightly from `/v1/workspaces/<id>/metadata` by `kql-schemas-refresh.yml` (opens a PR on drift) and on-demand via `contentops upstream check-schemas --write`. Wrapper findings ship at warning severity by default; setting `KQL_STRICT_PROMOTE_SEVERITY=1` in the lint workflows promotes them to the upstream `Diagnostic.Severity` (typically error), which is the recommended flip after the nightly workflow has filled out the baseline against the live tenant. Wrapper gracefully degrades to no-schema mode if `schemas.json` is missing / malformed. | n/a (resolved) | M actual; closed across PRs #224 + the F1.1 PR. |
| G2 46 `defender_custom_detection` reports CHANGED in drift, never reconciled: RESOLVED | High → resolved | `drift-report.json` (2026-05-06): `defender_custom_detection: total=46 changed=46`. Root cause: `DefenderCustomDetectionHandler._SERVER_FIELDS` in `contentops/handlers/defender_custom_detection.py` was missing `detectorId` and `lastRunDetails`, and lacked the nested-strip logic v1's `contentops/defender/collect.py` had for `queryCondition.lastModifiedDateTime` and `schedule.nextRunDateTime`. Local YAMLs were collected via v1 (which strips them); v2 drift kept them, so every rule diffed. Fixed in this PR: extended `_SERVER_FIELDS` + added `_SERVER_NESTED_FIELDS`. Regression covered by `test_defender_to_envelope_strips_g2_server_fields` and `test_defender_drift_clean_after_g2_fix` in `tests/v2/test_drift_roundtrip.py`. | n/a (resolved) | S (was estimated M; actual fix was 4 lines in `_SERVER_FIELDS` + 6 lines for nested strip + 2 tests). |
| G3 Marketplace catalog upstream-watcher deferred: RESOLVED | Medium → resolved | Closed by `contentops upstream check-marketplace` + the scheduled `upstream-watchers.yml` workflow (Mondays 07:00 UTC). The CLI diffs Sentinel's `contentPackages` ARM resource against the committed baseline at `manifests/upstream_marketplace.json` and writes `docs/whats-new/<YYYY-MM-DD>.md` when a change is detected; the workflow then opens a PR via `peter-evans/create-pull-request`. Implementation: `contentops/upstream/marketplace.py` (fetch + normalise), `contentops/upstream/manifest.py` (diff core), `contentops/cli/commands/upstream.py` (CLI). Tests under `tests/v2/test_upstream_*.py`. Revives the F7 plumbing that was removed in Phase 1E, scoped tighter (watcher-only; no auto-install). | n/a (resolved) | M actual. |
| G4 Templates catalog upstream-watcher deferred: RESOLVED | Medium → resolved | Closed by `contentops upstream check-templates` + the same `upstream-watchers.yml` workflow as G3. Reuses the manifest-diff machinery; baseline at `manifests/upstream_templates.json`. Implementation: `contentops/upstream/templates.py` (fetch + normalise via `provider.list_resource("alertRuleTemplates")`, the same call `contentops new --search-template` uses for on-demand lookup). | n/a (resolved) | S actual; shared plumbing with G3. |
| G5 Defender Graph extensions deferred | Medium | `docs/assets/defender_graph_extensions_deferred.md` explicitly: `savedQueries`, detection-tuning rules, alert suppression; endpoints not GA. No handler files for these under `contentops/handlers/`. | Manage in portal. | L (gated on Microsoft shipping the endpoints; we own only the re-probe logic). |
| G6 No emergency rollback command: RESOLVED | High → resolved | Closed by F3 in this PR. `contentops rollback <sha>` now materialises `detections/` at SHA via `git ls-tree` + `git show`, validates and applies against a temp tree, and writes audit records with `message="rollback to <sha>"`. Implementation: `contentops/rollback.py`, CLI in `contentops/cli/commands/rollback.py` (`rollback_cmd`), 13 tests in `tests/v2/test_rollback.py`. Defaults dry-run; honours `localCustomization: true` locks; non-destructive (assets that exist today but not at SHA are left alone; run prune for full reset). | n/a (resolved) | M actual; the design sketch was accurate. |
| G7 No silent-rule detection: RESOLVED | Medium → resolved | Closed by F4. `contentops silent-rules` queries the workspace's `SecurityAlert` + `SecurityIncident` tables and surfaces per-rule counts with a configurable `--since` lookback (default 30d). Implementation: `contentops/cli/commands/silent_rules.py` (docstring: "Closes G7."), registered in `contentops/cli/__init__.py`, backed by `contentops/workspace_kql.py`, the Log Analytics Query API bootstrap this row's effort estimate flagged as the missing dependency. Scheduled in `.github/workflows/silent-rules.yml`. Tests: `tests/v2/test_workspace_kql.py` covers the KQL helper; `tests/v2/test_workflow_state_and_telemetry.py` covers the workflow path. | n/a (resolved) | M actual; estimate matched. |
| G8 No analytic test harness: DEFERRED | Medium | No pipeline rule test command. F2 in `roadmap.md` is parked on the far-future backlog. If revived, the path is CSV fixtures or retrospective via Sentinel/Defender API (reusing the `workspace_kql` helper from F4), not the Python KQL evaluator originally proposed. | Run the rule against historical data manually in the portal. | L (CSV authoring is bounded; retrospective-API is a thin wrapper on F4). |
| G9 No cost/quota awareness: OUT OF SCOPE | n/a | Cost in this org is driven by ingest, not by detection rules. Detection KQL reads data that's already been paid for at ingest time, so a per-rule "GB scanned" estimate is the wrong abstraction (would mislead operators into tuning detections to lower a bill that won't actually move). The historical cost-heuristic lint rule (and the whole supporting module) was removed for the same reason in PR #234. F5 (pipeline cost) is rejected; see `roadmap.md`. Cost optimisation belongs in a separate ingest-side workflow. | n/a (out of scope) | n/a |
| G10 No drift-resolution UX: RESOLVED (partial) | Medium → resolved | `contentops drift-resolve <id> --strategy {git\|remote}` ships; `--strategy merge` raises `NotImplementedStrategy` by design, so operators pick git or remote per rule. Implementation: `contentops/drift_resolve.py`; tests in `tests/v2/test_drift_resolve.py`. | n/a (resolved) | M actual. |
| G11 Pipeline-deployed envelopes still carry `id: sentinel-<guid>`: RESOLVED | Low → resolved | All 101 envelopes under `detections/sentinel_analytic/` now carry slug-based ids (e.g. `a-user-added-an-account-to-a-privileged-role`); `grep -c '^id: sentinel-[0-9a-f]\{8\}-' detections/sentinel_analytic/*.yml` returns 0. The slugified-id transition is complete. `metadata.arm_name` preserves the original ARM resource GUID (`contentops/handlers/sentinel_analytic.py:304`) so apply and prune still address the right remote resource without leaking provenance into the user-facing envelope id. The `contentops collect --rename-existing` flag at `contentops/cli/commands/collect.py:82` performed the migration. | n/a (resolved) | S; already complete, no further work needed. |
| G12 Operational/incident collection has no cap or archival path: OBSOLETE | n/a → obsolete | The seven operational asset kinds (incidents, incident tasks, watchlist items, four workspace-manager kinds), their handlers, the `OPERATIONAL_ASSETS` set, and the `--include-operational` opt-in flag were all deleted in the asset-taxonomy reduction (PR #122). Only 6 configuration kinds are managed today; there is no operational-data fan-out path to cap. | n/a (no longer applicable) | n/a |
| G13 Lifecycle gate is implicit: PARTIALLY RESOLVED | Medium → Medium (partial) | `status: experimental` is the only status that doesn't deploy (`contentops/handlers/sentinel_analytic.py:117`). Closed by F8 to the extent achievable without F2: `contentops lifecycle promote <id>` runs four gates: `status_is_experimental`, `recent_validation` (META001's source of truth), `live_test_pass` (DEFERRED; depends on F2, which is parked), and `fp_rate_threshold` (LIVE when `--workspace-id` / `PIPELINE_WORKSPACE_ID` is set; computes `closed_fp_30d / incidents_30d` against the threshold in `config/lifecycle.yml`, fail-closed on workspace errors). Implementation: `contentops/lifecycle.py`; CLI in `contentops/cli/commands/lifecycle.py`; tests in `tests/v2/test_lifecycle_promote.py`. Fully closes only when F2 ships. | Run `contentops lifecycle promote <id>` (or `--force` with reviewer approval recorded out-of-band when the live-test gate would have been load-bearing). | S (remaining: F2 live-test gate). |
| G14 No content-coverage gap analysis vs MITRE: RESOLVED | Medium → resolved | Closed by F9 in this PR. `contentops coverage --gaps` enumerates MITRE techniques NOT covered by any detection. Implementation: `contentops/coverage/gaps.py`; bundled reference list at `contentops/coverage/data/mitre_attack_techniques.json` (curated subset of ~70 high-value techniques). `--techniques-file FILE` swaps in a custom list (e.g. parsed from MITRE STIX or an org threat model). 14 tests in `tests/v2/test_coverage_gaps.py`. | n/a (resolved) | S; actual implementation matched the design sketch effort estimate. |
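
To make the G1 row concrete, here is a minimal sketch of what a regex-level policy rule in the spirit of KQL101 could look like. It is not the code in `contentops/lint/strict_rules.py`; the `Finding` type, the function name, the pattern, and the message text are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass

# Assumed pattern: a pipe followed directly by the `take` or `limit` operator.
_TAKE_LIMIT = re.compile(r"\|\s*(take|limit)\b")


@dataclass(frozen=True)
class Finding:
    code: str
    line: int
    message: str


def check_take_limit(query: str) -> list[Finding]:
    """Flag `| take` / `| limit` line by line, ignoring text after `//`."""
    findings: list[Finding] = []
    for lineno, text in enumerate(query.splitlines(), start=1):
        # Naive comment handling: a `//` inside a string literal (e.g. a URL)
        # also truncates the line, which a real parser would not do.
        code_part = text.split("//", 1)[0]
        match = _TAKE_LIMIT.search(code_part)
        if match:
            findings.append(
                Finding("KQL101", lineno, f"row-limiting operator '{match.group(1)}' found")
            )
    return findings
```

A rule like this can catch operator misuse but cannot tell whether a column or table exists, which is why the row above pairs it with the Kusto.Language parser and schema loading.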
| Gap | Severity | Evidence | Effort |
|---|---|---|---|
| G15 State file orphan-branch checkout not wired in any workflow — RESOLVED | Medium → resolved | The CLI (`contentops state sync push | pull |
| G16 Audit chain has no monotonic timestamp guarantee: RESOLVED | Low → resolved | Closed by `_monotonic_timestamp` in `contentops/audit/writer.py:130`. Module docstring (lines 15-34) explicitly cites G16 and explains the contract: within a batch, `_chain_records` advances each timestamp to `max(record.timestamp, prev_timestamp + 1µs)`; across batches, `write_records` seeds the chain with the previous tail's timestamp; `verify_chain` enforces `current_timestamp >= prev_timestamp` so pre-bump records on disk still verify. `CLAUDE.md` invariant §9 ("Audit trail is hash-chained + monotonic") reflects this. | n/a (resolved) |
| G17 No automated pre-flight diff against the live tenant on PR: RESOLVED | Medium → resolved | Closed by PR #239. `contentops plan --against-tenant` extends the plan command to also call `contentops.core.drift.detect_drift` and overlay an apply-side summary: `CREATE: N · UPDATE: M · NO-CHANGE: K · ORPHAN-IN-TENANT: J`. Drift's framing ("what's in tenant that's not in repo") is translated to apply-side verbs ("what would apply do"). Implementation: `contentops/cli/commands/apply.py` (`plan_cmd` extended with `--against-tenant` flag + `_print_against_tenant_summary`); tests in `tests/v2/test_plan_against_tenant.py` (4 tests). Default OFF so fork PRs / offline runs keep working; remote-list failures degrade to a banner without failing the command. The original PR-time piece was already covered operationally via the `drift-pr` job in `drift.yml`; this commit adds the sharper "apply preview" framing for operators running plan locally before merging. | n/a (resolved) |
| G18 No bulk-disable / cohort-disable command: RESOLVED | Medium → resolved | `contentops disable` ships with three selectors: positional `rule_id`, `--pattern <glob>` (envelope id glob), and `--cohort <name>` (exact match against `metadata.cohort`). The inverse `contentops enable` mirrors the same three selectors plus `--to {experimental,production,test}` (default `experimental` so re-promotion still goes through F8's gates). Both commands keep a symmetric audit trail in the YAML: disable writes `disableReason: "..."` (or a dated comment), enable strips that marker and writes `enableReason: "..."` (or its own dated comment). Implementation: `contentops/cli/commands/lifecycle.py` `disable_cmd` / `enable_cmd`. Tests: `tests/v2/test_disable_pattern.py` (12 tests) + `tests/v2/test_enable_pattern.py` (15 tests including a disable→enable round-trip). F14 (`retry-failed --since/--run-id`) covers the recovery side of the same papercut; this row covers the cohort lifecycle side. | n/a (resolved) |
| G19 `metadata.lastValidatedAt` is read but not enforced: RESOLVED | Low → resolved | Closed by lint rule META001 in `contentops/lint/metadata_rules.py` (lines 115-171), wired into the runner at `contentops/lint/runner.py:168`. The rule emits a warning for missing `lastValidatedAt`, escalates to error on unparseable, and warns when the field is older than threshold (180 days default, 90 for `status: production`). Under `policy.scaffoldStrict=true` the staleness finding escalates to error so `validate.yml`'s `lint --strict` step blocks the PR. | n/a (resolved) |
| G20 No telemetry-overlay on the portfolio report: RESOLVED | Medium → resolved | Closed by F20. `contentops portfolio --with-telemetry --workspace-id <id> --telemetry-since 30` adds four operational columns (`alerts_30d`, `incidents_30d`, `closed_fp_30d`, `fp_rate`) to the per-rule report. Implementation: `contentops/cli/commands/portfolio.py` lines 21-156, sharing F4's workspace-KQL helper via `telemetry_query()` in `contentops/workspace_kql.py` line 157. Tests: `tests/v2/test_portfolio_telemetry.py` (5 tests). Scheduled nightly via `portfolio.yml`; telemetry is opt-in via the `PIPELINE_WORKSPACE_ID` Actions variable so the workflow still runs in inputs-only mode when the variable is unset. The original sketch's fifth column (`est_gb_scanned_per_day`) was dropped because F5 is rejected; the cost lever is on ingest, not detections. | n/a (resolved) |
| G21 experimental → test lifecycle has no test workspace: RESOLVED | Low → resolved | Closed by PR #239. `WorkspaceRole` extended to include `test` (`contentops/config.py`); routing semantics differentiated in `contentops/core/env_status.py`: `_DEDICATED_TEST_ALIASES = {"test"}` accepts only `{TEST, DEPRECATED}` (production envelopes do NOT spill into a dedicated test workspace; test workloads stay isolated from live prod); `_INTEGRATION_ALIASES = {"integration", "staging", "stage"}` keeps the historical `{TEST, PRODUCTION, DEPRECATED}` shared-lower-env behaviour. Every `--role` flag on the 13 affected CLI commands now accepts `test` alongside the three existing values. Tests: `tests/v2/test_workspace_role_test.py` (4 tests) + updated `tests/v2/test_env_status_filter.py` (split the dedicated-test semantic from integration). Operators who want isolated test deployments configure a workspace with `role: test`; operators who prefer the old shared-lower-env pattern continue to use `role: integration`. | n/a (resolved) |
| G22 No restore-from-export command (inverse of `contentops collect archive`): RESOLVED | Medium → resolved | Closed by F10. `contentops restore <archive.tar.gz>` reads a collect snapshot (with optional `MANIFEST.json`) and restores `detections/<asset_kind>/*.yml` under `--out` (default `detections/`). Refuses to overlay a non-empty target without `--force`. Defends against path-traversal entries. Implementation: `contentops/restore.py`; CLI registered in `contentops/cli/commands/archive.py` (`restore_cmd`). Tests: `tests/v2/test_restore.py`. | n/a (resolved) |
| G23 No content-aware diff between two collect snapshots: RESOLVED | Low → resolved | Closed by F12. `contentops snapshot-diff <a.tar.gz> <b.tar.gz>` indexes envelopes by `(asset_kind, envelope_id)` and diffs payloads using the same per-handler hash projection the apply path uses, so file renames + reordering don't surface as noise. Output: per-asset list of created / updated / deleted / unchanged. Implementation: `contentops/snapshot_diff.py`; CLI registered in `contentops/cli/commands/archive.py` (`snapshot_diff_cmd`). Tests: `tests/v2/test_snapshot_diff.py`. Pairs naturally with F10 (restore). | n/a (resolved) |
| G24 Authoring-metadata backlog on collected production rules | Medium | 51 production detection envelopes (every file under `detections/`) lack the META002-005 authoring fields (`metadata.description`, `metadata.attackDescription`, `metadata.references`, `metadata.falsePositives`). Each fires as a WARNING under the lint policy. As of PR #241, `policy.scaffoldStrict` defaults to `False` (lenient) so adopters with a fresh `config/tenant.yml` aren't blocked by the backlog out of the box (see [[feedback_internal_fail_fast_public_smooth]]); the operator's own internal `tenant.yml` carries `scaffoldStrict: true` to keep the fail-fast posture on authored content. Closing this gap is human content authoring: one paragraph of description, one of attackDescription, at least one URL in references, at least one entry in falsePositives per rule. | L (51 rules × 4 fields; can't be synthesised meaningfully). |
| G25 `auto-disabled-rules` silently returns 0 rows when SentinelHealth diagnostic is off: RESOLVED | Low → resolved | Closed by PR #239. `contentops doctor --auth` now runs `_check_sentinel_health()` at `contentops/devex/doctor.py`, which fires a `SentinelHealth \| take 1` probe against the prod workspace. PASS when rows exist; WARN with a doc link (https://learn.microsoft.com/en-us/azure/sentinel/health-audit) when zero rows are returned, distinguishing "all rules healthy" from "diagnostic not configured" without forcing the operator into the Azure portal. Tests: `tests/v2/test_doctor_sentinel_health.py` (6 tests). | n/a (resolved) |
| G26 Dead URLs in `references[]` only surfaced by weekly cron, not on the introducing PR: RESOLVED | Low → resolved | Closed by PR #239. `scripts/check_references.py` learned a `--diff-base REF` flag that walks only URLs newly added by the PR diff (via `git diff --name-only` + per-file before/after URL extraction). Wired into `.github/workflows/validate.yml` as a fast PR step; the weekly full HEAD-check in `references-check.yml` stays as the safety net. Tests: `tests/v2/test_check_references_diff.py` (5 tests). | n/a (resolved) |
| G27 Fork-PR contributors get degraded signal on `drift-pr` / `tuning-preview` / `validate.yml` schema refresh: RESOLVED (doc) | n/a → resolved | Closed by PR #239 (documentation). A new section in `docs/onboarding.md` titled "Contributing from a fork" documents which checks degrade on fork PRs (OIDC token unavailable), which still work (lint, tests, references URL check, structural plan), and the workaround (rebase into the base repo for full signal). Design is correct as-is; the gap was operator awareness. | n/a (resolved) |
| G28 No alert-to-detection performance tracking: RESOLVED | Medium → resolved | Closed by F21. `contentops alerts sync` fetches alerts from Graph `alerts_v2` (Defender, 30d lookback) and Sentinel ARM incidents (90d lookback) into a PII-free JSONL ledger with watermark-based incremental sync. Graph `detectorId` capture enables reliable alert-to-detection correlation via a four-tier matching strategy (ARM GUID, exact title, alert format prefix, substring containment). Upsert logic handles reclassifications (TP→FP, FP→TP) idempotently. Daily rollup store provides gap filling, idempotent rebuild, and version tracking. Config: `config/tenant.yml` `alerts:` block with `defenderLookbackDays`, `sentinelLookbackDays`, `ledgerRetentionDays`, `rollupRetentionDays`. | n/a (resolved); M actual. |
| G29 No per-detection health recommendations: RESOLVED | Medium → resolved | Closed by F21. `contentops alerts health` computes per-detection health with six recommendation categories: TUNE (FP rate > 40%), CLASSIFY (>50% unclassified alerts with >5 total alerts), SILENT (0 alerts in period), HEALTHY (TP rate > 80%), REVIEW (metrics outside normal thresholds), EXPECTED_SILENT (detection marked as expected-silent). Includes owner mapping via `config/owners.yml`, version tracking, and expected-vs-actual volume comparison. `--sync-owners` auto-populates the ownership file. | n/a (resolved); M actual. |
| G30 No unified multi-audience report: RESOLVED | Medium → resolved | Closed by F22. `contentops report --unified` renders a single self-contained HTML report for CEO (posture score), CISO (MITRE heatmap), SOC Manager (owner accountability matrix), Engineers (per-detection health + attention queue), and Hunters (silent/uncovered gaps). Consumes the detection health report, MITRE coverage data, and ownership mapping. | n/a (resolved); M actual. |
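
The monotonic-timestamp contract described in the G16 row can be sketched like this. It is an illustration of the stated rule only, not the code in `contentops/audit/writer.py`: the function names and the use of bare `datetime` values instead of audit records are assumptions.

```python
from datetime import datetime, timedelta
from typing import Optional

ONE_MICROSECOND = timedelta(microseconds=1)


def chain_timestamps(
    timestamps: list[datetime], prev_tail: Optional[datetime] = None
) -> list[datetime]:
    """Advance each timestamp to max(ts, previous + 1µs).

    prev_tail stands in for the last timestamp of the previous batch,
    so ordering also holds across batches.
    """
    chained: list[datetime] = []
    prev = prev_tail
    for ts in timestamps:
        if prev is not None:
            ts = max(ts, prev + ONE_MICROSECOND)
        chained.append(ts)
        prev = ts
    return chained


def verify_monotonic(timestamps: list[datetime]) -> bool:
    """Accept equal neighbours (>=) so records written before the bump still verify."""
    return all(cur >= prev for prev, cur in zip(timestamps, timestamps[1:]))
```

The writer is strict (each record lands at least one microsecond after its predecessor) while the verifier is lenient (`>=`), which matches the row's note that older on-disk records must keep verifying.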
These are not gaps in the same sense — the team has explicitly chosen not to build them, and the choice is documented:
- Multi-tenant fan-out. Single-tenant model is fixed in `CLAUDE.md` and load-bearing in `contentops/config.py`. Any roadmap proposal must preserve it.
- Replacement for the Sentinel UI for ad-hoc investigation. The pipeline is a configuration management tool; live triage and investigation stay in the portal.
- Workspace bootstrapping at scale. `contentops bootstrap` does one workspace; orchestrating many is out of scope.
- Playbook authoring. Logic Apps Designer remains the source of truth for playbook internals; we manage deployment only.
Every gap above passes the test "if I read the cited file or run
the cited command, the gap is observable." Where the citation is a
line rather than a file, the line still exists at the SHA this PR
is opened from. Reviewers: spot-check 3–5 entries against main
before approving.