Tutorial 07 — CVE response end-to-end

Status: active

What you'll learn: Drive the full CVE response pipeline on a synthetic CVE — exposure calculation, AI-generated runbook, operator approval, orchestrated remediation via rolling upgrades, verification, learning capture.

Time: ~75 min (mostly waiting for batched remediation)

Builds on: Tutorial 06 — CVE remediation orchestrates the same rolling_module_upgrade skill across all affected modules. You should have the rolling upgrade machinery working end-to-end first.

Sets you up for: Tutorial 09 — Honeypot canary — different defensive posture (detection of unauthorized access vs. patching of known vulnerabilities); both use the same autonomy + approval rails.

What you're building

flowchart TD
    Op[Security operator] --> Inject[Inject synthetic CVE<br/>DRILL: CVE-2026-99001]
    Inject --> EC[ExposureCalculator]
    EC --> Calc[Walk SBOMs<br/>match affected_packages]
    Calc --> Exp[CveExposure rows:<br/>system-base × 42 instances<br/>nginx × 18 instances]
    Exp --> Triage[CVE Responder tick<br/>cve_response skill]
    Triage --> Plan[Plan:<br/>system-base 1.0.3 → 1.0.4<br/>nginx 1.24 → 1.26]
    Plan --> RG[Concierge execute_agent<br/>cve_runbook_generate skill]
    RG --> Page[Page persisted<br/>shareable runbook]
    Plan --> Apr[ApprovalRequest<br/>risk_score=95]
    Apr --> Op2{Operator approves?}
    Op2 -->|yes| CRE[CveResponseExecutor]
    CRE --> RU1[rolling_module_upgrade<br/>system-base 42 instances]
    CRE --> RU2[rolling_module_upgrade<br/>nginx 18 instances]
    RU1 --> Verify
    RU2 --> Verify[ExposureCalculator re-run]
    Verify --> Final{exposed_count == 0?}
    Final -->|yes| Done[create_learning<br/>compound learning]
    Final -->|no| AF[attribute_failure<br/>silent instances]

By the end you'll have rehearsed the CVE response flow against a synthetic CVE and produced a learning that compounds for the team.

Concept refresher

The CVE response pipeline has 4 stages:

Ingest — CVE records come from NVD (hourly SystemCveFeedJob) or are injected manually for drills. Match keys are affected_packages (name + version range). The feed ingester (CveOps::FeedIngestService) paginates the NVD 2.0 API (resultsPerPage/startIndex), pulling up to POWERNODE_NVD_MAX_PAGES × POWERNODE_NVD_PAGE_SIZE results per run (default 25 × 2000 = up to 50k CVEs), so a large window is fully pulled across the loop rather than capped at one page.
Expose — ExposureCalculator walks each NodeModule's SBOM (ingested per Tutorial 02 from cosign attestations) and computes per-instance exposure. Output is one CveExposure row per (CVE, NodeInstance) pair.
Triage — cve_response skill builds a remediation plan: for each affected module, what version bump removes the exposure, batched by instance count.
Remediate — CveResponseExecutor orchestrates one rolling_module_upgrade per affected module. Parallel when modules are independent; sequenced when dependency_spec requires it.

requires_approval always fires for risk_score ≥ 50. Lower-risk CVEs can have auto-remediation policies, but production deployments typically gate everything.

Why drills matter: the muscle memory of CVE response is what saves time during a real incident. Quarterly drills produce learnings that compound.

Prerequisites

Requirement	How
Tutorial 06 completed	You understand rolling upgrade + circuit breaker behavior
Fleet using `system-base` and/or `nginx` modules with SBOMs ingested	Tutorial 02 covers SBOM ingestion via Stage 2 CI
Operator with `system.cve_remediate` approval rights	Default for admin users
(Optional) `Ai::Concierge` configured for runbook generation	See AI agents in CLAUDE.md

Step 1 — Inject a synthetic CVE

For drills, inject directly (real CVEs come from NVD feed automatically):

platform.system_create_cve({
  cve_id: "DRILL-CVE-2026-99001",
  severity: "critical",
  cvss_score: 9.8,
  affected_packages: [{ name: "openssl", version: "<3.1.4" }],
  summary: "DRILL: Synthetic OpenSSL TLS handshake RCE",
  published_at: "2026-05-17T00:00:00Z"
})
// → { cve: { id, ... } }

Expected outcome: CVE row created in severity: critical.

Drill naming convention: always prefix synthetic CVE IDs with DRILL- so they're never confused with real NVD records in audit logs and learnings.

Step 2 — Calculate exposure

The platform runs ExposureCalculator automatically on new CVE rows (via the system_cve_feed worker), but for a fresh drill you can trigger it directly:

platform.system_get_cve_exposure({ cve_id: "DRILL-CVE-2026-99001" })
// → {
//      exposed_modules: [
//        { id: "mod-system-base", name: "system-base", version: "1.0.3", assignment_count: 42 },
//        { id: "mod-nginx",       name: "nginx",       version: "1.24.0", assignment_count: 18 }
//      ],
//      exposed_instance_count: 60,
//      total_fleet_size: 150,
//      exposure_pct: 40.0
//    }

Expected outcome: 40% fleet exposure — needs a coordinated response.

Step 3 — Triage with the `cve_response` skill

There is no execute_skill MCP action — the system-cve-response triage skill is an executor bound to the CVE Responder agent, which runs it automatically on its 60-second reconcile tick when a CvePublishedSensor signal fires for the new CVE. To inspect the triage result, read the exposure (Step 2) and the agent's recent events; the reconciler supplies the skill inputs from the CveExposure rows. The triage produces:

// system-cve-response output (computed by the CVE Responder reconcile tick —
// not a direct operator call):
{
  "cve_id": "DRILL-CVE-2026-99001",
  "risk_score": 95,
  "exposed_modules": [],
  "exposed_instance_count": 60,
  "remediation_plan": {
    "actions": [
      { "module": "system-base", "from": "1.0.3", "to": "1.0.4", "instance_count": 42, "batch_size": 4, "estimated_seconds": 2520 },
      { "module": "nginx",       "from": "1.24.0", "to": "1.26.0", "instance_count": 18, "batch_size": 2, "estimated_seconds": 1080 }
    ],
    "estimated_seconds": 3600
  },
  "requires_approval": true
}

Expected outcome: 1-hour total remediation estimate; 2 parallel module upgrades.

Step 4 — Generate operator runbook

The system-cve-runbook-generate skill is a read-shape executor bound to the System Concierge, so you trigger it by asking the Concierge in chat — or programmatically via execute_agent (resolve the Concierge's UUID with list_agents first; there is no execute_skill action):

platform.list_agents()
// → { agents: [{ id: "<concierge-uuid>", name: "System Concierge", ... }, ...] }

platform.execute_agent({
  agent_id: "<concierge-uuid>",   // execute_agent also accepts a slug or exact name
  input: { input: "Generate a CVE remediation runbook for DRILL-CVE-2026-99001 and persist it as a page." }
})
// The Concierge calls the system-cve-runbook-generate executor internally and returns:
// → { runbook_markdown: "# DRILL-CVE-2026-99001 — Remediation Runbook\n\n...",
//     persisted_page_id: "page-..." }

Expected outcome: runbook surfaces:

Exposed module list + version-bump targets
Step-by-step remediation actions (with command snippets)
Verification procedures (post-fix exposure recompute)
Communication template for stakeholders

Page is shareable with the security team via /app/wiki or Slack.

Step 5 — Approval

Operator opens /app/approvals → sees the proposed plan + risk_score (95)

generated runbook from Step 4 → optionally adjusts batch_size per module (smaller for control-plane modules; larger for stateless) → clicks Approve.

Once approved, autonomy reconciler executes the plan.

Step 6 — Watch the remediation

platform.recent_events({ kind_prefix: "cve", limit: 50 })
// → cve.remediation_started, cve.remediation_module_started, cve.module_remediation_completed, ...

platform.recent_events({ kind_prefix: "module.upgrade", limit: 100 })
// → standard rolling upgrade events from Tutorial 06

Or via UI: /app/system/operations → CVE response panel shows progress per module.

Verification

After all batches complete (~1 hour):

platform.system_get_cve_exposure({ cve_id: "DRILL-CVE-2026-99001" })
// → { exposed_instance_count: 0, exposed_modules: [], ... }

Expected outcome: zero exposure. If non-zero, some instances were silent during the upgrade window:

// system-attribute-failure is a read-shape skill bound to the Concierge —
// invoke it by asking the Concierge (no execute_skill action exists):
platform.execute_agent({
  agent_id: "<concierge-uuid>",   // from list_agents (see Step 4); slug/name also accepted
  input: { input: "Attribute the reconcile failure on instance <silent-instance> looking back 24 hours." }
})
// → the Concierge runs the system-attribute-failure executor and returns the
//   diagnosis of why this instance didn't reconcile

Extract a learning

platform.create_learning({
  title: "DRILL: DRILL-CVE-2026-99001 OpenSSL response — 60-instance scope",
  category: "discovery",
  content: "Synthetic critical CVE drill. Exposure correctly identified 60 instances across system-base + nginx. Total fix duration: 58 min. system-base bump (1.0.3→1.0.4) ran first per remediation plan (42 instances, 4-batch, 0 failures). nginx (1.24→1.26) ran in parallel after first 2 batches of system-base completed (per dependency_spec). Zero circuit breaker trips. Lessons: parallel module upgrades work when modules are independent; sequencing was correct via DependencyResolutionService.",
  tags: ["cve-drill", "openssl", "incident-response", "rolling-upgrade"],
  related_entities: [
    { type: "cve", id: "DRILL-CVE-2026-99001" },
    { type: "module", name: "system-base" },
    { type: "module", name: "nginx" }
  ]
})

Cleanup (drill only)

platform.system_delete_cve({ cve_id: "DRILL-CVE-2026-99001" })

For a real CVE, do not delete — keep the record for audit. The CveExposure rows transition to remediated and stay queryable.

Troubleshooting

CVE MCP actions unreachable — system_create_cve, system_get_cve, system_get_cve_exposure, and system_delete_cve are all registered actions. If a call fails it's an auth/permission issue, not a missing action; confirm your token carries the CVE permissions. Direct-model fallbacks if you ever need them from a console (cd server && bundle exec rails console):

Insert: System::Cve.create!(cve_id: "DRILL-...", severity: "critical", ...)
Query exposure: System::CveExposure.where(cve_id: ...)
Delete: System::Cve.find_by(cve_id: ...).destroy

Triage skill returns risk_score: 0 — SBOM ingestion isn't seeing the affected packages. Verify:

Module CI has the syft step (Tutorial 02 step 7)
The webhook landed (platform.recent_events({ kind_prefix: "system.sbom.ingested" }))
Package names in your affected_packages match exactly what syft emitted

Operator approval queue empty after triage — cve_response skill ran but didn't create the ApprovalRequest. Check the agent's intervention policy for system.cve_remediate:

// agent_introspect resolves by UUID only — look up "CVE Responder" first:
platform.list_agents()
// → { agents: [{ id: "<cve-responder-uuid>", name: "CVE Responder", ... }, ...] }
platform.agent_introspect({ agent_id: "<cve-responder-uuid>" })

If the policy says auto_approve for low-severity CVEs and your synthetic is critical, it should always require approval — file an issue if not.

Remediation completes but exposure recompute still shows non-zero — race condition or silent instances. Cross-check:

platform.system_list_instances({
  template_id: "<affected-template>",
  exclude_running_module_digest: "<new-digest>"
})
// → instances that still report the old digest

For each one, run attribute_failure to find out why it didn't reconcile.

Container image CVEs aren't detected — cve_response matches against NodeModule.package_spec (i.e., OS packages), not container image contents. For images, use external scanners (Trivy, Grype) per Use Case 10 in USE_CASE_MATRIX.md.

What's next

Tutorial 08 — Instance pools — for stateless workloads, the pool-replacement remediation strategy (terminate old, claim new-version from pool) is faster and safer than in-place upgrade.
runbooks/cve-response.md — full operator CVE runbook with SBOM-aware matching details.
SKILL_EXECUTORS.md — cve_response, cve_runbook_generate, cve_remediation_orchestration reference.
CVE Responder agent — autonomous remediation agent that runs this pipeline on every CVE feed update; see CLAUDE.md AI agents section.
Run drills quarterly — every drill produces a learning that compounds. Real incidents reuse the muscle memory.

Last verified: 2026-06-03

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Tutorial 07 — CVE response end-to-end

What you're building

Concept refresher

Prerequisites

Step 1 — Inject a synthetic CVE

Step 2 — Calculate exposure

Step 3 — Triage with the `cve_response` skill

Step 4 — Generate operator runbook

Step 5 — Approval

Step 6 — Watch the remediation

Verification

Extract a learning

Cleanup (drill only)

Troubleshooting

What's next

FilesExpand file tree

07-cve-response.md

Latest commit

History

07-cve-response.md

File metadata and controls

Tutorial 07 — CVE response end-to-end

What you're building

Concept refresher

Prerequisites

Step 1 — Inject a synthetic CVE

Step 2 — Calculate exposure

Step 3 — Triage with the cve_response skill

Step 4 — Generate operator runbook

Step 5 — Approval

Step 6 — Watch the remediation

Verification

Extract a learning

Cleanup (drill only)

Troubleshooting

What's next

Step 3 — Triage with the `cve_response` skill