Investigation flow: Agent queries observability BEFORE recommending actions #6

@nomadicmehul

Summary

Update the investigation pipeline so the agent always queries connected observability tools as part of its standard investigation flow, not just K8s state.

Current Behavior

  1. Alert received → 2. Check K8s pods/events/deployments → 3. Check logs (Cloud Logging only) → 4. Generate RCA

Desired Behavior

  1. Alert received → 2. Check K8s pods/events/deployments → 3. Query observability tools (Grafana, Datadog, Sentry) for logs, traces, errors, and metrics → 4. Correlate infra + app signals → 5. Generate enriched RCA
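
The desired flow could be sketched roughly as follows. This is a minimal illustration, not TheNightOps' actual code: `ObservabilityProvider`, `Investigation`, `check_k8s_state`, and `investigate` are hypothetical names, and the K8s check is a stub standing in for the real pod/event/deployment inspection.

```python
from dataclasses import dataclass, field
from typing import Protocol


class ObservabilityProvider(Protocol):
    """Hypothetical interface for a connected tool (Grafana, Datadog, Sentry)."""

    name: str

    def query(self, service: str) -> list[str]:
        """Return log, trace, error, or metric findings for a service."""
        ...


@dataclass
class Investigation:
    service: str
    k8s_findings: list[str] = field(default_factory=list)
    app_signals: dict[str, list[str]] = field(default_factory=dict)
    provider_errors: dict[str, str] = field(default_factory=dict)
    mode: str = "k8s-only"


def check_k8s_state(service: str) -> list[str]:
    # Stub (assumption): the real step inspects pods, events, and deployments.
    return [f"{service}: pod restart count elevated"]


def investigate(service: str, providers: list[ObservabilityProvider]) -> Investigation:
    result = Investigation(service=service)

    # Step 2: infrastructure state.
    result.k8s_findings = check_k8s_state(service)

    # Step 3: query every connected observability tool.
    for provider in providers:
        try:
            result.app_signals[provider.name] = provider.query(service)
        except Exception as exc:
            # A failing provider is recorded but must not abort the investigation.
            result.provider_errors[provider.name] = str(exc)

    # Steps 4-5 (correlation, RCA) would consume both signal sets; here we
    # only record whether application-level evidence is available at all.
    if result.app_signals:
        result.mode = "full-stack"
    return result
```

With an empty provider list, or when every provider fails, `mode` stays `"k8s-only"`, so the RCA step can tell which kind of evidence it is working with.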

Acceptance Criteria

  • Root Orchestrator includes an observability query phase
  • Log Analyst agent uses observability providers (not just Cloud Logging)
  • Deployment Correlator cross-references deploy timing with error-rate changes reported by observability tools
  • RCA output includes evidence from observability tools (trace IDs, error stacks, metric anomalies)
  • Degrades gracefully when no observability tool is configured (falls back to K8s-only investigation)
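
As one possible reading of the Deployment Correlator criterion, the sketch below compares the mean error rate in a window before and after a deploy. The function names, the 15-minute window, and the thresholds (`min_ratio`, `min_after`) are illustrative assumptions, not values from the project; `samples` stands for `(timestamp, error_rate)` pairs already fetched from an observability provider.

```python
from datetime import datetime, timedelta
from statistics import mean


def error_rate_shift(samples, deployed_at, window=timedelta(minutes=15)):
    """Return (before, after) mean error rates around a deploy.

    Returns None when either side of the deploy has no samples, so the
    caller can fall back to K8s-only evidence.
    """
    before = [rate for ts, rate in samples if deployed_at - window <= ts < deployed_at]
    after = [rate for ts, rate in samples if deployed_at <= ts <= deployed_at + window]
    if not before or not after:
        return None
    return mean(before), mean(after)


def deploy_is_suspect(samples, deployed_at, min_ratio=2.0, min_after=0.01):
    """Flag a deploy when the error rate rose by at least min_ratio afterwards."""
    shift = error_rate_shift(samples, deployed_at)
    if shift is None:
        return False
    before, after = shift
    # max(...) avoids treating a zero baseline as an infinite increase
    # unless the post-deploy rate also clears the absolute floor.
    return after >= min_after and after >= min_ratio * max(before, 1e-9)
```

A ratio test alone would flag a jump from 0.001% to 0.002%, so the absolute floor `min_after` is included to ignore changes that are too small to matter.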

Impact

This is the core behavioral change that transforms TheNightOps from a "K8s state checker" into a "full-stack investigator".

Labels

  • area:agent (Agent architecture and orchestration)
  • area:rca (Root cause analysis)
  • phase:1-observability (Phase 1: Deep Observability Integration)
  • priority:critical (Must-have for the phase)
  • type:enhancement (Improvement to existing feature)
