🛡️ AetherGuard: The Agentic SRE Sentinel

AetherGuard is an autonomous, self-healing infrastructure sentinel. It uses Large Language Models (LLMs) and a multi-level Actor-Critic governance pattern to detect, synthesize, and remediate infrastructure incidents in real time.

📺 Dashboard Showcase

The 'War Room' is a glassmorphism-inspired interface that shows the agent's reasoning alongside infrastructure operations.

Native OpenTelemetry integration provides deep correlation between traces, logs, and AI interventions.
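How AetherGuard links a trace to an AI intervention is not shown in this README. One common approach, sketched here as an assumption, is to extract the trace ID from a W3C `traceparent` header (`version-traceid-parentid-flags`) and store it next to the remediation record. `Intervention` and `TraceCorrelation` are hypothetical names used only for illustration.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical record pairing a trace with the action taken for it.
record Intervention(String traceId, String action) {}

final class TraceCorrelation {
    // W3C Trace Context layout: 2 hex version, 32 hex trace-id, 16 hex parent-id, 2 hex flags.
    private static final Pattern TRACEPARENT =
            Pattern.compile("^[0-9a-f]{2}-([0-9a-f]{32})-[0-9a-f]{16}-[0-9a-f]{2}$");

    /** Returns the trace ID, or empty if the header is malformed or the ID is all zeros. */
    static Optional<String> traceId(String traceparent) {
        Matcher m = TRACEPARENT.matcher(traceparent);
        if (!m.matches()) {
            return Optional.empty();
        }
        String id = m.group(1);
        // An all-zero trace ID is invalid per the W3C specification.
        return id.chars().allMatch(c -> c == '0') ? Optional.empty() : Optional.of(id);
    }

    public static void main(String[] args) {
        String header = "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01";
        traceId(header)
                .map(id -> new Intervention(id, "rollout restart"))
                .ifPresent(System.out::println);
    }
}
```

Keeping the trace ID on the intervention record lets a dashboard query join remediation steps to the trace that triggered them.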

🧠 Architectural Vision

AetherGuard operates on a Safety-First loop. Every action proposed by the Actor (Planner) is strictly audited by a Critic (Safety Sentinel) before touching the production cluster.

```mermaid
graph TD
    A[Incident Detected] -->|OTel Alert| B(Coordination Hub)
    B --> C{Triple-Prompt Pipeline}
    C -->|Step 1| D[Synthesis Prompt]
    C -->|Step 2| E[Planner Prompt]
    C -->|Step 3| F[Safety Critic Audit]

    F -->|REJECTED| G[Log Safety Violation]
    F -->|APPROVED| H[K8s Fabric8 Client]

    H --> I[Infrastructure Remediation]
    I -->|Rollout Restart| J[Success Verify]

    subgraph "AI Brain (Actor-Critic)"
    D
    E
    F
    end

    subgraph "Audit & Dashboard"
    B
    G
    end
```
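
The flow in the diagram can be read as a three-step pipeline with a gate at the end. The following is a minimal sketch of that control flow, not AetherGuard's actual code: `Plan`, `Verdict`, and `SafetyLoop` are hypothetical names, and each LLM prompt is stood in for by a plain `Function` so the example runs without any model.

```java
import java.util.List;
import java.util.Optional;
import java.util.function.Function;

// Hypothetical types for illustration only; not taken from the AetherGuard codebase.
record Plan(String target, List<String> commands) {}
record Verdict(boolean approved, String reason) {}

final class SafetyLoop {
    private final Function<String, String> synthesizer; // Step 1: raw alert -> incident summary
    private final Function<String, Plan> planner;       // Step 2: summary -> proposed plan (Actor)
    private final Function<Plan, Verdict> critic;       // Step 3: plan -> audit verdict (Critic)

    SafetyLoop(Function<String, String> synthesizer,
               Function<String, Plan> planner,
               Function<Plan, Verdict> critic) {
        this.synthesizer = synthesizer;
        this.planner = planner;
        this.critic = critic;
    }

    /** Runs the three steps in order and returns the plan only if the critic approves it. */
    Optional<Plan> handle(String alert) {
        String summary = synthesizer.apply(alert);
        Plan plan = planner.apply(summary);
        Verdict verdict = critic.apply(plan);
        if (!verdict.approved()) {
            // REJECTED branch: record the violation and stop before any cluster call.
            System.err.println("Safety violation: " + verdict.reason());
            return Optional.empty();
        }
        return Optional.of(plan); // APPROVED branch: the caller may hand the plan to the executor.
    }

    public static void main(String[] args) {
        SafetyLoop loop = new SafetyLoop(
                alert -> "Pod is crash-looping: " + alert,
                summary -> new Plan("deployment/web", List.of("rollout restart")),
                plan -> new Verdict(plan.commands().size() == 1, "only single-step plans allowed"));
        System.out.println(loop.handle("CrashLoopBackOff on web-7d9f"));
    }
}
```

Returning an `Optional` keeps the executor unreachable on the rejected path: there is no plan object to act on unless the critic said yes.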

🚀 Key Features

- Agentic SRE Core: Uses the ReAct (Reason + Act) pattern via LangChain4j.
- Multi-LLM Provider Support: Works with OpenAI, Groq (through its OpenAI-compatible API), and Ollama (for local/private LLM usage).
- Safety-First Governance: The Critic audits every plan proposed by the Actor, so destructive commands (DELETE/DROP) are rejected before execution.
- Native Observability: Correlates OpenTelemetry traces directly with AI remediation steps.
- War Room Dashboard: Real-time monitoring using Quarkus Renarde and HTMX.
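
The Critic described above is an LLM prompt. A deterministic deny-list check can sit alongside it as a backstop; the sketch below is an illustration of that idea under stated assumptions, not the project's implementation. `DestructiveCommandGuard` is a hypothetical name, and the deny-list contains only the two verbs this README mentions.

```java
import java.util.List;
import java.util.Optional;
import java.util.regex.Pattern;

final class DestructiveCommandGuard {
    // Assumed deny-list: the README names only DELETE and DROP.
    // \b keeps words like "dropdown" from matching.
    private static final Pattern DENY =
            Pattern.compile("\\b(DELETE|DROP)\\b", Pattern.CASE_INSENSITIVE);

    /** Returns the first command containing a denied verb, or empty if every command passes. */
    static Optional<String> firstViolation(List<String> commands) {
        return commands.stream()
                .filter(command -> DENY.matcher(command).find())
                .findFirst();
    }

    public static void main(String[] args) {
        List<String> plan = List.of("kubectl get pods", "kubectl delete pod web-1");
        System.out.println(firstViolation(plan).orElse("plan is clean"));
    }
}
```

A rule like this cannot judge intent, which is why it complements the LLM audit rather than replacing it.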

🛠️ Quick Start Guide

Prerequisites

- Java 21 and Maven 3.9+
- Docker (for Quarkus Dev Services / PostgreSQL)
- kubectl (configured for a local or remote cluster)

Setup & Run

Clone the repository:

```
git clone https://github.com/rjaco/aetherguard.git
cd aetherguard
```

3. Start the Infrastructure (Important!)

You have two options to run the required services (PostgreSQL + OpenTelemetry):

Option A: Automatic (Recommended)

Ensure Docker Desktop (or Docker Engine) is running.
Simply run mvn quarkus:dev. Quarkus will automatically spin up the necessary containers (Dev Services).

Option B: Manual Control

If you prefer to manage the containers yourself or if Dev Services fails:
```
docker-compose up -d
```
This will start PostgreSQL (port 5432) and Jaeger (UI on port 16686, OTLP gRPC on port 4317).
Then run mvn quarkus:dev.

4. Configure your LLM Provider

AetherGuard supports OpenAI, Groq, and Ollama. Create a .env file in the root or export variables:

Option A: OpenAI (Default)

```
LLM_PROVIDER=openai
LLM_API_KEY=sk-proj-...
LLM_MODEL_NAME=gpt-4o-mini
```

Option B: Groq (Recommended for Speed)

```
LLM_PROVIDER=openai
LLM_BASE_URL=https://api.groq.com/openai/v1
LLM_API_KEY=gsk_...
LLM_MODEL_NAME=llama-3.3-70b-versatile
```

Option C: Ollama (Local/Private)

```
LLM_PROVIDER=ollama
OLLAMA_BASE_URL=http://localhost:11434
OLLAMA_MODEL_NAME=llama3
```
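
To make the relationship between the three options concrete, here is a minimal sketch of how these variables could be resolved into one settings object. It is an assumption about the wiring, not AetherGuard's actual configuration code: `LlmSettings` is a hypothetical name, and treating a missing `LLM_BASE_URL` as "use the provider's default endpoint" is a guess based on Option A omitting it.

```java
import java.util.Map;

// Hypothetical settings holder; the variable names are the ones listed in the options above.
record LlmSettings(String provider, String baseUrl, String apiKey, String modelName) {

    /** Resolves settings from an environment-like map such as System.getenv(). */
    static LlmSettings fromEnv(Map<String, String> env) {
        // The README marks OpenAI as the default, so it is assumed when LLM_PROVIDER is unset.
        String provider = env.getOrDefault("LLM_PROVIDER", "openai");
        if (provider.equals("ollama")) {
            // The Ollama option lists no API key, so none is read here.
            return new LlmSettings(provider, require(env, "OLLAMA_BASE_URL"), null,
                    require(env, "OLLAMA_MODEL_NAME"));
        }
        // OpenAI and OpenAI-compatible endpoints (e.g. Groq) share the LLM_* variables.
        // A null base URL is assumed to mean "use the provider's default endpoint".
        return new LlmSettings(provider, env.get("LLM_BASE_URL"),
                require(env, "LLM_API_KEY"), require(env, "LLM_MODEL_NAME"));
    }

    private static String require(Map<String, String> env, String key) {
        String value = env.get(key);
        if (value == null || value.isBlank()) {
            throw new IllegalStateException("Missing required variable: " + key);
        }
        return value;
    }

    public static void main(String[] args) {
        LlmSettings settings = fromEnv(Map.of(
                "LLM_PROVIDER", "ollama",
                "OLLAMA_BASE_URL", "http://localhost:11434",
                "OLLAMA_MODEL_NAME", "llama3"));
        System.out.println(settings.provider() + " -> " + settings.modelName());
    }
}
```

Failing fast on a missing variable surfaces a misconfigured `.env` at startup instead of at the first incident.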

Launch the Sentinel:
```
./mvnw quarkus:dev
```
Access the Interface:
- War Room Dashboard: http://localhost:8080/
- Swagger UI (API Docs): http://localhost:8080/q/swagger-ui
- Dev UI: http://localhost:8080/q/dev

🧪 Chaos Demonstration

Run the specialized chaos script to see AetherGuard in action:

```
./infra/chaos.sh
```

This will simulate a pod crash and an invalid image tag. Watch the dashboard as AetherGuard synthesizes the error and performs a safe rollout restart.
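
For readers unfamiliar with what a rollout restart does under the hood: `kubectl rollout restart` patches the pod template with a `kubectl.kubernetes.io/restartedAt` annotation, and the changed template triggers a rolling replacement of pods. The sketch below builds that patch body in plain Java. How AetherGuard issues the restart through its Fabric8 client is not shown in this README; `RolloutRestart` is a hypothetical name.

```java
import java.time.Instant;
import java.time.temporal.ChronoUnit;

final class RolloutRestart {
    // Annotation key that kubectl sets on the pod template to force a new rollout.
    static final String ANNOTATION = "kubectl.kubernetes.io/restartedAt";

    /** Builds a merge-patch body that stamps the pod template with a restart time. */
    static String restartPatch(Instant now) {
        // Truncate to whole seconds so the timestamp is a compact RFC 3339 value.
        String timestamp = now.truncatedTo(ChronoUnit.SECONDS).toString();
        return "{\"spec\":{\"template\":{\"metadata\":{\"annotations\":{\""
                + ANNOTATION + "\":\"" + timestamp + "\"}}}}}";
    }

    public static void main(String[] args) {
        System.out.println(restartPatch(Instant.now()));
    }
}
```

Because the patch only touches an annotation, it leaves the image and replica count alone, which is what makes this remediation non-destructive.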

📂 Project Structure

/src/main/java/com/aetherguard/ai: LangChain4j Interfaces & Tooling.
/src/main/java/com/aetherguard/model: Reactive Persistence Entities.
/src/main/java/com/aetherguard/service: Core Orchestration Logic.
/infra: Kubernetes Manifests, Terraform, and Chaos Scripts.

Built with ❤️ for the SRE and Java Development community.
