Auto-generate IaC fix PRs for infrastructure issues #12

@nomadicmehul

Summary

When the agent identifies a root cause that can be fixed through infrastructure-as-code changes (Helm values, Kustomize overlays, K8s manifests, Terraform), it should generate a ready-to-merge PR with the fix.

Why This Matters

This is the feature that makes TheNightOps go from "tells you what's wrong" to "fixes it for you." Sonarly does this for application code — we do it for infrastructure code.

Examples

| Root Cause | Generated PR |
|---|---|
| OOMKill: memory limit too low | Bump resources.limits.memory in Helm values |
| CrashLoop: bad config | Revert ConfigMap to last known good version |
| CPU throttling | Increase CPU limits or add HPA config |
| Failed scheduling: node affinity | Update node selector / tolerations |
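
One way to picture the mapping in the table above is a small lookup from root cause to a fix plan. This is only a sketch: `RootCause`, `FixPlan`, `FIX_PLANS`, and `plan_fix` are hypothetical names, and apart from `resources.limits.memory` the field paths are assumptions about which manifest fields each fix would touch.

```python
from dataclasses import dataclass
from enum import Enum


class RootCause(Enum):
    """Root causes from the examples table (names are hypothetical)."""
    OOM_KILL = "oom_kill"
    CRASH_LOOP_BAD_CONFIG = "crash_loop_bad_config"
    CPU_THROTTLING = "cpu_throttling"
    FAILED_SCHEDULING = "failed_scheduling"


@dataclass(frozen=True)
class FixPlan:
    """What the generated PR would change, at a high level."""
    summary: str
    fields: tuple[str, ...]  # assumed manifest fields the fix edits


FIX_PLANS = {
    RootCause.OOM_KILL: FixPlan(
        "Bump memory limit", ("resources.limits.memory",)),
    RootCause.CRASH_LOOP_BAD_CONFIG: FixPlan(
        "Revert ConfigMap to last known good version", ("data",)),
    RootCause.CPU_THROTTLING: FixPlan(
        "Increase CPU limits or add HPA config", ("resources.limits.cpu",)),
    RootCause.FAILED_SCHEDULING: FixPlan(
        "Update node selector / tolerations", ("nodeSelector", "tolerations")),
}


def plan_fix(cause: RootCause) -> FixPlan:
    """Look up the fix plan for a diagnosed root cause."""
    return FIX_PLANS[cause]
```

A root cause outside this table would raise `KeyError`, which seems like a reasonable signal that no IaC fix PR should be generated for it.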

Acceptance Criteria

  • Detect which IaC files manage the affected resources (Helm, Kustomize, raw manifests)
  • Generate a diff with the fix (resource limit change, config revert, etc.)
  • Create a PR on GitHub with: title, RCA summary, diff, evidence
  • PR description includes: what happened, why, what this fix does, rollback instructions
  • Requires policy engine approval (Level 2+ actions)
  • Config: GitHub token, target repo, base branch
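
A rough sketch of how the PR description and the policy gate from the criteria above could fit together. All names here (`PRConfig`, `FixProposal`, `requires_approval`, `render_pr_body`) are hypothetical, the `main` default for the base branch is an assumption, and the actual GitHub API call is left out on purpose.

```python
from dataclasses import dataclass

FENCE = "`" * 3  # built at runtime so the sketch contains no literal fence


@dataclass(frozen=True)
class PRConfig:
    """Settings listed in the criteria; field names are hypothetical."""
    github_token: str
    target_repo: str           # e.g. "org/infra"
    base_branch: str = "main"  # assumed default


@dataclass(frozen=True)
class FixProposal:
    """Everything the PR needs: RCA summary, diff, evidence."""
    title: str
    what_happened: str
    why: str
    what_fix_does: str
    rollback: str
    diff: str
    evidence: tuple[str, ...] = ()
    action_level: int = 2


def requires_approval(proposal: FixProposal) -> bool:
    """Level 2+ actions must be approved by the policy engine."""
    return proposal.action_level >= 2


def render_pr_body(p: FixProposal) -> str:
    """Render the PR description with the sections the criteria ask for."""
    evidence = "\n".join(f"- {item}" for item in p.evidence) or "- (none attached)"
    sections = [
        ("What happened", p.what_happened),
        ("Why", p.why),
        ("What this fix does", p.what_fix_does),
        ("Rollback instructions", p.rollback),
        ("Diff", f"{FENCE}diff\n{p.diff.rstrip()}\n{FENCE}"),
        ("Evidence", evidence),
    ]
    return "\n\n".join(f"## {heading}\n{body}" for heading, body in sections) + "\n"
```

Keeping rendering separate from PR creation means the body can be reviewed or tested without a GitHub token.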

Scope

Start with Kubernetes manifest changes (raw YAML). Helm, Kustomize, and Terraform support will follow in future iterations.
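
For the initial raw-YAML scope, a minimal sketch of the diff step for the OOMKill case might look like the following. It uses only the standard library and edits the manifest as text so comments and formatting survive. `bump_memory_limit` and `make_diff` are hypothetical names, and the line-based matching is an assumption that only holds for block-style YAML indented with spaces; a real implementation would likely want a round-trip YAML parser.

```python
import difflib


def bump_memory_limit(manifest: str, new_limit: str) -> str:
    """Set `memory:` under every `limits:` block; `requests:` is left alone."""
    out = []
    limits_indent = None  # indentation of the `limits:` key we are inside, if any
    for line in manifest.splitlines(keepends=True):
        stripped = line.strip()
        indent = len(line) - len(line.lstrip(" "))
        if stripped == "limits:":
            limits_indent = indent
        elif limits_indent is not None and stripped and indent <= limits_indent:
            limits_indent = None  # dedent means the limits block has ended
        elif limits_indent is not None and stripped.startswith("memory:"):
            newline = "\n" if line.endswith("\n") else ""
            line = " " * indent + f"memory: {new_limit}" + newline
        out.append(line)
    return "".join(out)


def make_diff(path: str, before: str, after: str) -> str:
    """Produce a unified diff suitable for the PR description."""
    return "".join(difflib.unified_diff(
        before.splitlines(keepends=True),
        after.splitlines(keepends=True),
        fromfile=f"a/{path}",
        tofile=f"b/{path}",
    ))
```

Note that this touches every container's limits in the file; narrowing it to the affected container would need the container name from the RCA.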
