A production-grade, self-healing API gateway built with Java 21, Spring Cloud, and autonomous AI operations.
NeuraGate is an intelligent, self-healing API gateway that goes beyond simple request routing. It observes its own telemetry, detects anomalies in real time, consults an AI advisor, and, when confidence is high enough, takes corrective action autonomously.
This project demonstrates enterprise-grade engineering concepts including event-driven architecture, reactive programming, Project Loom virtual threads, autonomous AI-driven operations, and production-ready observability, all built on the modern Java 21 + Spring Cloud ecosystem.
| Feature | Description |
|---|---|
| 🤖 Autonomous AI Advisor | Rule-based heuristics (LLM-ready) generate structured recommendations; confidence-gated auto-execution applies circuit-breaker and rate-limit changes with a full audit trail |
| ⚡ Virtual Thread Performance | All I/O runs on Project Loom virtual threads — carrier-thread-safe, sub-millisecond overhead, scales to thousands of concurrent connections |
| 🔄 Event-Driven Telemetry | Every request publishes a GatewayTelemetry event to Kafka; a parallel consumer pipeline aggregates metrics, detects anomalies, and feeds the AI advisor |
| 🛡️ Self-Healing Circuit Breakers | Resilience4j circuit breakers trip on consecutive failures; the AI advisor lowers failure thresholds before failures cascade to other services |
| 📊 Real-Time Dashboard | Dark-themed SSE-powered UI at /index.html; live request feed, 6 metric cards, AI decision log — zero page refreshes |
| 🔐 Reactive JWT RBAC | Stateless SecurityWebFilterChain with three roles: ADMIN, ADVISOR, VIEWER; path-based rules enforce least-privilege access |
| 🔥 Built-in Stress Tester | POST /admin/test/stress/start fires 1,000 req/min at chaos endpoints to prove circuit breaker and telemetry behaviour under pressure |
| 📈 Prometheus Integration | 11 custom Micrometer gauges (gateway.latency.*, gateway.error.rate, gateway.anomalies.total) scraped by a co-located Prometheus container |
| 🗂️ Redis Route Store | Dynamic route definitions persisted in Redis — update routing rules at runtime without restarting the gateway |
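
The self-healing circuit breaker row can be illustrated with a minimal plain-Java sketch. This is not the Resilience4j API and not NeuraGate's code: `SimpleCircuitBreaker` and `setFailureThreshold` are hypothetical names, and the half-open recovery state is left out to keep the example short. It only shows how a breaker that trips on consecutive failures becomes more sensitive when something (such as an advisor) lowers its threshold.

```java
import java.util.function.Supplier;

/** Hypothetical illustration only; NeuraGate itself relies on Resilience4j. */
public class SimpleCircuitBreaker {
    public enum State { CLOSED, OPEN }

    private int failureThreshold;        // consecutive failures that trip the breaker
    private int consecutiveFailures = 0;
    private State state = State.CLOSED;

    public SimpleCircuitBreaker(int failureThreshold) {
        this.failureThreshold = failureThreshold;
    }

    /** An advisor could call this to make the breaker trip earlier. */
    public synchronized void setFailureThreshold(int newThreshold) {
        if (newThreshold < 1) {
            throw new IllegalArgumentException("threshold must be >= 1");
        }
        this.failureThreshold = newThreshold;
        // Re-evaluate immediately: failures already counted may now exceed the limit.
        if (consecutiveFailures >= failureThreshold) {
            state = State.OPEN;
        }
    }

    /** Runs the action while CLOSED; returns the fallback when OPEN or on failure. */
    public synchronized <T> T call(Supplier<T> action, T fallback) {
        if (state == State.OPEN) {
            return fallback;
        }
        try {
            T result = action.get();
            consecutiveFailures = 0;     // any success resets the streak
            return result;
        } catch (RuntimeException e) {
            consecutiveFailures++;
            if (consecutiveFailures >= failureThreshold) {
                state = State.OPEN;
            }
            return fallback;
        }
    }

    public synchronized State state() {
        return state;
    }
}
```

With a threshold of 3, two failures leave the breaker closed; lowering the threshold to 2 opens it at once, which is the "trip earlier" behaviour the table describes.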
                       ┌──────────────────────────────────────────────┐
                       │              NeuraGate Mesh                  │
                       │                                              │
Client Request ──────► │  Spring Cloud Gateway (Virtual Threads)      │
                       │         │                  │                 │
                       │   Rate Limiter      Circuit Breaker          │
                       │   (Redis token)     (Resilience4j)           │
                       │         │                                    │
                       │  Telemetry Producer ──► Kafka ──► Consumer   │
                       │                                      │       │
                       │                              MetricsBuffer   │
                       │                              AnomalyDetector │
                       │                              AI Advisor      │
                       │                              ActionExecutor  │
                       │                                              │
                       │  Redis ◄─── Dynamic Routes                   │
                       │  Prometheus ◄─── Micrometer Gauges           │
                       │  Dashboard ◄─── SSE Stream                   │
                       └──────────────────────────────────────────────┘
See ARCHITECTURE.md for the full Mermaid diagram and technical deep dive.
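
The "Rate Limiter (Redis token)" box refers to token-bucket rate limiting. As a rough illustration of the idea, here is an in-memory sketch. It is an assumption-laden stand-in, not the gateway's Redis-backed limiter: the class name `TokenBucket`, its parameters, and the explicit `nowNanos` clock argument (used so the behaviour is deterministic) are all hypothetical.

```java
/** In-memory token bucket; a hypothetical stand-in for a Redis-backed limiter. */
public class TokenBucket {
    private final long capacity;            // maximum burst size
    private final double refillPerSecond;   // steady-state rate
    private double tokens;
    private long lastRefillNanos;

    public TokenBucket(long capacity, double refillPerSecond, long nowNanos) {
        this.capacity = capacity;
        this.refillPerSecond = refillPerSecond;
        this.tokens = capacity;             // start full so an initial burst is allowed
        this.lastRefillNanos = nowNanos;
    }

    /**
     * Returns true if a token was available at the given time.
     * Callers would normally pass System.nanoTime().
     */
    public synchronized boolean tryAcquire(long nowNanos) {
        if (nowNanos > lastRefillNanos) {
            double elapsedSeconds = (nowNanos - lastRefillNanos) / 1_000_000_000.0;
            tokens = Math.min(capacity, tokens + elapsedSeconds * refillPerSecond);
            lastRefillNanos = nowNanos;
        }
        if (tokens >= 1.0) {
            tokens -= 1.0;
            return true;
        }
        return false;                       // caller would typically answer HTTP 429
    }
}
```

A bucket with capacity 2 and a refill rate of 1 token per second admits two immediate requests, rejects the third, and admits one more a second later.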
- Docker & Docker Compose
- Java 21+ (for local development only)
# Clone the repository
git clone https://github.com/SahilJ10319/intelligent-service-mesh.git
cd intelligent-service-mesh
# Start the entire mesh (Zookeeper → Kafka → Redis → Prometheus)
docker-compose up -d
# Build and run the gateway
./mvnw spring-boot:run

The gateway starts on http://localhost:8080
# Gateway health
curl http://localhost:8080/actuator/health
# Prometheus metrics
curl http://localhost:8080/actuator/prometheus
# Live dashboard
open http://localhost:8080/index.html
# Prometheus UI
open http://localhost:9090

NeuraGate uses stateless JWT bearer tokens with three role levels.
curl -X POST http://localhost:8080/auth/token \
-H "Content-Type: application/json" \
-d '{"username":"neuragate","password":"secret"}'

| Role | Paths | Purpose |
|---|---|---|
| ROLE_ADMIN | /admin/** | Stress tests, chaos controls, system management |
| ROLE_ADVISOR | /ai/analyze, /ai/audit-log, /ai/prompt | AI analysis and recommendations |
| ROLE_VIEWER | /dashboard/**, /ai/system-prompt | Read-only metrics and dashboard |
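
The path rules in the table can be sketched as a small lookup. This is only an illustration under stated assumptions, not the project's `SecurityConfig`: the class `PathRoleRules` is hypothetical, the role hierarchy (ADMIN above ADVISOR above VIEWER) is inferred from the "requires ADVISOR+" wording used elsewhere in this README, and unlisted paths are denied by default. Public endpoints such as `/auth/token` are not modelled.

```java
import java.util.Optional;

/** Hypothetical sketch of path-based role checks; not the project's SecurityConfig. */
public class PathRoleRules {
    /** Ordered from least to most privileged so ordinal() can express "at least". */
    public enum Role { VIEWER, ADVISOR, ADMIN }

    /** Minimum role needed for a path, or empty if no rule in the table matches. */
    public static Optional<Role> requiredRole(String path) {
        if (path.startsWith("/admin/")) {
            return Optional.of(Role.ADMIN);
        }
        if (path.equals("/ai/analyze")
                || path.equals("/ai/audit-log")
                || path.equals("/ai/prompt")) {
            return Optional.of(Role.ADVISOR);
        }
        if (path.startsWith("/dashboard/") || path.equals("/ai/system-prompt")) {
            return Optional.of(Role.VIEWER);
        }
        return Optional.empty();
    }

    /** Assumes higher roles inherit lower ones; unmatched paths are denied. */
    public static boolean isAllowed(Role caller, String path) {
        return requiredRole(path)
                .map(required -> caller.ordinal() >= required.ordinal())
                .orElse(false);
    }
}
```

Under these assumptions a VIEWER can read `/dashboard/...` but not `/ai/analyze`, while an ADMIN can reach every listed path.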
TOKEN="<your-jwt-here>"
# AI Analysis (requires ADVISOR+)
curl http://localhost:8080/ai/analyze \
-H "Authorization: Bearer $TOKEN"
# Trigger stress test (requires ADMIN)
curl -X POST http://localhost:8080/admin/test/stress/start \
-H "Authorization: Bearer $TOKEN"

| Endpoint | Method | Description |
|---|---|---|
| /ai/analyze | GET | Run full AI analysis of recent metrics |
| /ai/analyze?autoExecute=true | GET | Run analysis + auto-apply if confidence ≥ 80% |
| /ai/audit-log | GET | View all autonomous config changes |
| /ai/prompt | GET | Inspect the LLM prompt (debug) |
| /ai/system-prompt | GET | View the AI's role definition |
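
The `autoExecute=true` behaviour (apply a recommendation only when confidence is at least 80%) can be sketched as a simple gate. Everything named here is hypothetical: `ConfidenceGate`, `Recommendation`, and `AuditEntry` are illustrative shapes, not the project's `ActionExecutor` or its DTOs. The sketch also records skipped recommendations in the audit log, which is a design assumption rather than documented behaviour.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical confidence gate; not the project's ActionExecutor. */
public class ConfidenceGate {
    /** Illustrative shape of an advisor recommendation (confidence in [0, 1]). */
    public record Recommendation(String action, double confidence) {}

    /** Illustrative audit entry: what was proposed and whether it was applied. */
    public record AuditEntry(String action, double confidence, boolean executed) {}

    private final double threshold;
    private final List<AuditEntry> auditLog = new ArrayList<>();

    public ConfidenceGate(double threshold) {
        this.threshold = threshold;
    }

    /**
     * Applies the change only when auto-execution is requested and the
     * confidence meets the threshold. Every decision is logged either way.
     */
    public synchronized boolean submit(Recommendation rec, boolean autoExecute, Runnable apply) {
        boolean execute = autoExecute && rec.confidence() >= threshold;
        if (execute) {
            apply.run();
        }
        auditLog.add(new AuditEntry(rec.action(), rec.confidence(), execute));
        return execute;
    }

    /** Read-only snapshot of the audit trail. */
    public synchronized List<AuditEntry> auditLog() {
        return List.copyOf(auditLog);
    }
}
```

With a threshold of 0.80, a recommendation at exactly 0.80 is applied, one at 0.79 is only logged, and nothing is applied when `autoExecute` is false regardless of confidence.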
# 1. (Optional) Configure chaos before the test
curl -X POST http://localhost:8080/admin/test/chaos \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{"failureRate": 30, "latencyMs": 200}'
# 2. Start the load generator (1,000 req/min for 60s)
curl -X POST http://localhost:8080/admin/test/stress/start \
-H "Authorization: Bearer $TOKEN"
# 3. Watch live SSE progress
curl -N http://localhost:8080/admin/test/stress/events \
-H "Authorization: Bearer $TOKEN"
# 4. Reset chaos after test
curl -X POST http://localhost:8080/admin/test/chaos/reset \
-H "Authorization: Bearer $TOKEN"

src/main/java/com/neuragate/
├── config/ # Route config, Kafka, rate limiter, circuit breaker
├── gateway/ # Core filter chain, telemetry producer, fallback
├── telemetry/ # Kafka consumer, MetricsBuffer, AnomalyDetector
├── ai/ # AiAdvisorService, ActionExecutor, ConfigUpdateEvent
├── dashboard/ # DashboardController, SseEmitterService
├── security/ # SecurityConfig, JwtAuthManager, Role enum
├── stresstesting/ # LoadTestService, StressTestController
├── mock/ # ChaosSettings, InventoryController, MockConfigController
├── health/ # GatewayHealthIndicator
└── repository/ # RedisRouteDefinitionRepository
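
To give a feel for what the `telemetry/` package does, here is a minimal sketch of sliding-window aggregation with a threshold-based anomaly check. It is an assumption about the general technique, not the actual `MetricsBuffer` or `AnomalyDetector`: the class `SlidingErrorRate`, the fixed-size window, and the error-rate threshold are all hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical sliding-window error-rate check; not the project's AnomalyDetector. */
public class SlidingErrorRate {
    private final int windowSize;              // number of most recent requests kept
    private final double errorRateThreshold;   // e.g. 0.5 means "more than 50% errors"
    private final Deque<Boolean> outcomes = new ArrayDeque<>();
    private int errors = 0;

    public SlidingErrorRate(int windowSize, double errorRateThreshold) {
        if (windowSize < 1) {
            throw new IllegalArgumentException("windowSize must be >= 1");
        }
        this.windowSize = windowSize;
        this.errorRateThreshold = errorRateThreshold;
    }

    /** Records one request outcome and evicts the oldest once the window is full. */
    public synchronized void record(boolean isError) {
        outcomes.addLast(isError);
        if (isError) {
            errors++;
        }
        if (outcomes.size() > windowSize && outcomes.removeFirst()) {
            errors--;                          // the evicted entry was an error
        }
    }

    public synchronized double errorRate() {
        return outcomes.isEmpty() ? 0.0 : (double) errors / outcomes.size();
    }

    /** Flags an anomaly only once the window is full, to avoid noisy startup alerts. */
    public synchronized boolean isAnomalous() {
        return outcomes.size() == windowSize && errorRate() > errorRateThreshold;
    }
}
```

In a pipeline like the one described, a consumer would call `record` for each telemetry event and hand the advisor a summary whenever `isAnomalous` returns true.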
| Days | Milestone |
|---|---|
| 1–5 | Project setup, Spring Cloud Gateway routing, Redis integration |
| 6–10 | Rate limiting, circuit breakers, health checks, Resilience4j |
| 11–15 | Mock chaos service, Kafka integration, event-driven telemetry |
| 16–19 | MetricsBuffer, telemetry consumer pipeline, real-time aggregation |
| 20–21 | Prometheus metrics (11 custom gauges), anomaly detection |
| 22–23 | AI Advisor core, prompt engineering, structured LLM response DTOs |
| 24 | Autonomous ActionExecutor with confidence-gated auto-execution |
| 25 | Real-time SSE dashboard with live request feed and AI decision log |
| 26 | Reactive JWT authentication (HMAC-SHA256, stateless) |
| 27–28 | RBAC (ADMIN/ADVISOR/VIEWER), built-in 1,000 req/min stress tester |
| 29–30 | Production docs, architecture deep dive, docker-compose polish |
This project was built as a 30-day engineering demonstration. Issues, improvements, and pull requests are welcome.