Keep your AI agents healthy and auditable. aiopsx provides a complete operations toolkit for deploying, monitoring, and governing AI agent services in production — with automatic rollback, human-in-the-loop approval gates, and immutable audit logs.
- 🔍 Health Monitoring — Continuous `/health` checks and Prometheus metrics scraping
- 🤖 LLM-Powered Diagnosis — AI analyzes anomalies and suggests remediation
- 🛡️ Policy Gates — Define which actions are auto-approved vs human-approved
- 🔄 Automatic Rollback — If health degrades after an action, automatically revert
- 📜 Audit Trail — Every decision, approval, and action logged to PostgreSQL
- 🖥️ Streamlit Dashboard — Human-in-the-loop approval UI, metrics charts, logs
- 🐳 Docker Compose Ready — One-command deploy with all services
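
To make the policy-gate idea concrete, here is a minimal sketch of how an action could be routed to automatic execution or to human approval. It is not aiopsx's actual implementation: the `PolicyGate` class, the `Decision` enum, and the action names are all hypothetical and chosen only for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    AUTO = "auto"
    HUMAN = "human_approval"


@dataclass(frozen=True)
class PolicyGate:
    # Hypothetical allow-list: action names that may run without a human.
    auto_approved: frozenset[str]

    def decide(self, action: str) -> Decision:
        # Anything not explicitly allow-listed falls back to human approval.
        return Decision.AUTO if action in self.auto_approved else Decision.HUMAN


# Illustrative action names, not taken from the aiopsx codebase.
gate = PolicyGate(auto_approved=frozenset({"restart_service", "clear_cache"}))
print(gate.decide("restart_service"))
print(gate.decide("scale_down"))
```

Defaulting unknown actions to human approval keeps the gate fail-safe: a new action type never runs unattended until someone adds it to the allow-list.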
git clone https://github.com/GBOYEE/aiopsx.git
cd aiopsx
cp .env.example .env
docker compose up -d
Open dashboard: http://localhost:8501
Control plane: http://localhost:8000/docs (API docs)
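
After the stack starts, a quick way to confirm that both URLs respond is a small probe script. The sketch below uses only the Python standard library; the helper names `is_up` and `check_all` are hypothetical, and it assumes the default ports listed above.

```python
import urllib.error
import urllib.request


def is_up(url: str, timeout: float = 3.0) -> bool:
    """Return True if the URL answers with a 2xx or 3xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or an HTTP error status.
        return False


def check_all(urls, probe=is_up):
    """Probe each URL and return a {url: bool} mapping."""
    return {url: probe(url) for url in urls}


if __name__ == "__main__":
    targets = [
        "http://localhost:8501",       # dashboard
        "http://localhost:8000/docs",  # control plane API docs
    ]
    for url, ok in check_all(targets).items():
        print(f"{'UP  ' if ok else 'DOWN'} {url}")
```

The `probe` parameter is injectable so the check can be exercised without a running stack.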
flowchart TD
subgraph "Monitoring"
A[Agent Health] --> B[Alert Engine]
end
B --> C{LLM Diagnosis}
C --> D[Plan Actions]
D --> E{Policy Gate}
E -->|Auto| F[Execute Action]
E -->|Human Approval| G[Streamlit Dashboard]
G -->|Approve| F
F --> H[Post-action Health Check]
H -->|Degraded| I[Rollback]
H -->|OK| J[Log Success]
I --> J
J --> K[(Postgres Audit)]
See docs/architecture.md for detailed explanation.
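
The execute, check, rollback, and log steps from the flowchart can be sketched as a single function. This is an illustrative outline under stated assumptions, not the project's code: `run_action` and its parameters are hypothetical, and an in-memory list stands in for the PostgreSQL audit table.

```python
from __future__ import annotations

from typing import Callable


def run_action(
    name: str,
    execute: Callable[[], None],
    rollback: Callable[[], None],
    is_healthy: Callable[[], bool],
    audit: list[dict],
) -> str:
    """Execute an action, check health, roll back if degraded, log the outcome."""
    execute()
    if is_healthy():
        outcome = "success"
    else:
        # Post-action health check failed: revert before logging.
        rollback()
        outcome = "rolled_back"
    # Both paths end in an audit entry, mirroring the flowchart.
    audit.append({"action": name, "outcome": outcome})
    return outcome


# Toy example: a "deploy" that the health check rejects.
state = {"version": "v1"}
audit_log: list[dict] = []

result = run_action(
    name="deploy_v2",
    execute=lambda: state.update(version="v2"),
    rollback=lambda: state.update(version="v1"),
    is_healthy=lambda: state["version"] != "v2",  # pretend v2 is unhealthy
    audit=audit_log,
)
print(result, state, audit_log)
```

Passing the execute, rollback, and health-check steps as callables keeps the control flow separate from any particular agent or deployment target.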
| Component | Technology |
|---|---|
| Control Plane | FastAPI, asyncio |
| Dashboard | Streamlit |
| Database | PostgreSQL (state + audit) |
| Cache/Bus | Redis |
| LLM | OpenAI / Ollama (pluggable) |
| Monitoring | Prometheus metrics |
| Deployment | Docker Compose |
pytest tests/ -v --cov=app --cov-report=html
CI runs on every push: lint (ruff), type-check (mypy), tests, coverage.
- Multi-tenant SaaS mode
- OAuth2 integrations (GitHub, Google)
- Advanced RBAC with resource-level permissions
- Audit log UI with filtering and export
- Grafana dashboard packs
- Webhook notifications (Slack, Teams)
We welcome contributions! Please read CONTRIBUTING.md before opening issues or PRs.
Good first issues: documentation, UI polish, additional metric collectors.
MIT — see LICENSE.
Built by Oyebanji Adegboyega • Portfolio • @Gboyee_0
