Enhanced incident memory: Store and learn from observability-enriched RCAs #10

@nomadicmehul

Summary

Upgrade the incident memory system to store richer incident records that include observability evidence (trace IDs, error patterns, metric anomalies), not just text descriptions. Improve similarity matching to use these structured fields.

Current State

Matching currently relies on TF-IDF text similarity over the title, root_cause, and resolution text fields, which are stored in JSON.
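For reference, a minimal sketch of what text-only matching of this kind looks like. It is not the project's actual code: the function names (tokenize, incident_text, tfidf_vectors, cosine, rank_by_text), the tokenizer, and the smoothed IDF formula are all assumptions made for illustration. Only the three field names come from this issue.

```python
import math
import re
from collections import Counter

# Field names taken from this issue; everything else is illustrative.
TEXT_FIELDS = ("title", "root_cause", "resolution")


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9_]+", text.lower())


def incident_text(incident):
    """Join the text fields that the text-only matcher looks at."""
    return " ".join(incident.get(name, "") for name in TEXT_FIELDS)


def tfidf_vectors(docs):
    """Return one sparse {term: weight} vector per document."""
    tokenized = [tokenize(doc) for doc in docs]
    n_docs = len(tokenized)
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    # Smoothed IDF (an assumption): never zero, never divides by zero.
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}
    return [{t: tf * idf[t] for t, tf in Counter(tokens).items()} for tokens in tokenized]


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_text(query, incidents):
    """Score every stored incident against the query, best match first."""
    docs = [incident_text(inc) for inc in incidents] + [query]
    vectors = tfidf_vectors(docs)
    query_vec = vectors[-1]
    scored = [(cosine(query_vec, vec), inc) for vec, inc in zip(vectors, incidents)]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)
```

Because this approach only sees words, two incidents on the same service with the same error type can score low if they were written up differently, which is the gap the structured fields are meant to close.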

Desired State

  • Store structured fields: affected_services, error_types, metric_anomalies, trace_patterns
  • Match on both text similarity AND structural overlap (same service + same error type = high match)
  • Weight recent incidents higher than old ones
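One possible shape for the extended record and the structural overlap, as a sketch only. The four structured field names come from the list above. The dataclass layout, the occurred_at and mttr_minutes fields, the use of sets, and the choice to average Jaccard scores across fields are assumptions, not a settled design.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional

# Structured field names as listed in this issue.
STRUCTURED_FIELDS = (
    "affected_services",
    "error_types",
    "metric_anomalies",
    "trace_patterns",
)


@dataclass
class IncidentRecord:
    # Existing text fields.
    title: str
    root_cause: str
    resolution: str
    # Assumed fields, needed later for time decay and MTTR reporting.
    occurred_at: Optional[datetime] = None
    mttr_minutes: Optional[float] = None
    # New structured observability fields, modelled as sets (an assumption).
    affected_services: set = field(default_factory=set)
    error_types: set = field(default_factory=set)
    metric_anomalies: set = field(default_factory=set)
    trace_patterns: set = field(default_factory=set)


def jaccard(a, b):
    """Jaccard similarity |A ∩ B| / |A ∪ B|; defined as 0.0 when both sets are empty."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def structural_overlap(a, b):
    """Mean Jaccard over the structured fields that at least one record fills in."""
    scores = [
        jaccard(getattr(a, name), getattr(b, name))
        for name in STRUCTURED_FIELDS
        if getattr(a, name) or getattr(b, name)
    ]
    return sum(scores) / len(scores) if scores else 0.0
```

Skipping fields that are empty on both sides keeps sparse records from being penalised for data nobody collected. Whether "same service + same error type" should count for more than the other two fields is left open here; a weighted mean would be a small change.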

Acceptance Criteria

  • Extended IncidentRecord model with structured observability fields
  • Hybrid matching: TF-IDF text + Jaccard similarity on structured fields
  • Time decay: incidents from the last 30 days are weighted 2x
  • Return richer context: "Last time this happened (14 days ago), root cause was X, resolution was Y, MTTR was Z minutes"
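The scoring and context criteria above could be combined roughly as follows. This is a sketch under stated assumptions: the 50/50 split between text and structural similarity, the step-function reading of "weighted 2x", and all function and constant names are hypothetical. Only the 30-day window, the 2x factor, and the wording of the context message come from this issue.

```python
from datetime import datetime, timedelta, timezone

TEXT_WEIGHT = 0.5          # assumption: equal weighting of the two signals
STRUCT_WEIGHT = 0.5        # assumption
RECENT_WINDOW_DAYS = 30    # from the acceptance criteria
RECENT_MULTIPLIER = 2.0    # from the acceptance criteria


def recency_multiplier(occurred_at, now):
    """Return 2.0 for incidents within the last 30 days, otherwise 1.0."""
    age_days = (now - occurred_at).total_seconds() / 86400.0
    if 0.0 <= age_days <= RECENT_WINDOW_DAYS:
        return RECENT_MULTIPLIER
    return 1.0


def hybrid_score(text_sim, struct_sim, occurred_at, now):
    """Blend TF-IDF and Jaccard similarity, then apply the recency weight."""
    base = TEXT_WEIGHT * text_sim + STRUCT_WEIGHT * struct_sim
    return base * recency_multiplier(occurred_at, now)


def format_context(root_cause, resolution, mttr_minutes, occurred_at, now):
    """Build the richer context message described in the acceptance criteria."""
    days_ago = (now - occurred_at).days
    return (
        f"Last time this happened ({days_ago} days ago), "
        f"root cause was {root_cause}, resolution was {resolution}, "
        f"MTTR was {mttr_minutes} minutes"
    )


if __name__ == "__main__":
    now = datetime(2024, 6, 15, tzinfo=timezone.utc)
    recent = now - timedelta(days=14)
    print(hybrid_score(0.4, 0.6, recent, now))
    print(format_context("pool exhaustion", "raised pool size", 42, recent, now))
```

Note that with a 2x multiplier the final score is no longer bounded by 1.0, so any match threshold would have to be chosen with that in mind. A smooth decay (for example exponential in age) is an alternative to the step function, but the criteria as written only ask for the 30-day cutoff.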

Metadata

Labels

  • area:agent (Agent architecture and orchestration)
  • area:rca (Root cause analysis)
  • phase:2-smart-triage (Phase 2 - Smart Triage & Investigation)
  • priority:high (Important, do if time allows)
  • type:enhancement (Improvement to existing feature)
