feat: add /metrics/aggregate endpoint for health metrics aggregation …#1046

Open
Bhavy12-cell wants to merge 1 commit into imDarshanGK:main from Bhavy12-cell:feat/health-metrics-endpoint-643
Conversation

@Bhavy12-cell

Summary

Closes #643

Problem

Monitoring dashboards had to poll multiple endpoints separately
(/healthz/ready, /metrics, /health) to get a full picture of system
health, making dashboard setup complex.

Solution

Added a single GET /metrics/aggregate endpoint that queries all
subsystems and returns a combined JSON response.
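The aggregation step described above can be sketched roughly as follows. This is a framework-free illustration using only the standard library, not the code in backend/app/routers/aggregate.py: the helper names `run_check` and `aggregate`, the "error" status value, and the rule that any failed subsystem makes the overall status "degraded" are all assumptions made for the example.

```python
import time
from dataclasses import dataclass, asdict
from datetime import datetime, timezone
from typing import Callable, Dict, Optional


@dataclass
class SubsystemStatus:
    # Stand-in for the PR's Pydantic model; the field set is assumed
    # from the example response.
    status: str
    elapsed_ms: Optional[float] = None


def run_check(check: Callable[[], None]) -> SubsystemStatus:
    """Run one probe, time it, and treat any exception as a failure."""
    start = time.perf_counter()
    try:
        check()
        status = "ok"
    except Exception:
        status = "error"  # hypothetical failure value
    elapsed_ms = (time.perf_counter() - start) * 1000
    return SubsystemStatus(status=status, elapsed_ms=round(elapsed_ms, 2))


def aggregate(checks: Dict[str, Callable[[], None]], version: str) -> dict:
    """Combine all probe results into one response-shaped dict."""
    subsystems = {name: run_check(fn) for name, fn in checks.items()}
    all_ok = all(s.status == "ok" for s in subsystems.values())
    return {
        "overall": "ok" if all_ok else "degraded",
        "version": version,
        "subsystems": {name: asdict(s) for name, s in subsystems.items()},
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }
```

Passing the probes in as plain callables keeps the sketch easy to test: a probe that raises stands in for a failed database connection.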

New Files

  • backend/app/routers/aggregate.py
    New router with GET /metrics/aggregate endpoint. Queries database,
    Prometheus, and API process subsystems. Protected by bearer token
    when METRICS_AUTH_TOKEN is set (same token as /metrics).

  • backend/tests/test_aggregate_metrics.py
    Tests covering 200 response structure, degraded state when DB fails,
    auth token enforcement, and prometheus_enabled flag.

Modified Files

  • backend/app/schemas.py
    Added SubsystemStatus and AggregateMetricsResponse Pydantic models.

  • backend/app/main.py
    Registered aggregate_router so /metrics/aggregate is served.

Example Response

GET /metrics/aggregate → 200 OK
{
  "overall": "ok",
  "version": "3.0.0",
  "subsystems": {
    "api": { "status": "ok", "elapsed_ms": 0.0 },
    "database": { "status": "ok", "elapsed_ms": 1.23 },
    "prometheus": { "status": "ok" }
  },
  "prometheus_enabled": true,
  "timestamp": "2026-06-14T18:00:00+00:00"
}
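For dashboard authors, a minimal sketch of consuming this payload might look like the following. The helper name `failing_subsystems` is hypothetical, and the sketch assumes that any status other than "ok" should be flagged.

```python
import json
from typing import List


def failing_subsystems(payload: str) -> List[str]:
    """Return the names of subsystems whose status is not "ok"."""
    data = json.loads(payload)
    subsystems = data.get("subsystems", {})
    return sorted(
        name for name, info in subsystems.items()
        if info.get("status") != "ok"
    )
```

With a single request, a dashboard can read `overall` for the headline state and use a helper like this to show which subsystem caused a degraded result.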

Security

  • Bearer token auth via METRICS_AUTH_TOKEN env var (optional)
  • Returns 401 on missing or wrong token when configured
  • With no token configured, the endpoint is unauthenticated; this is intended only for internal dashboards on trusted networks
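The optional token check listed above could be sketched as below. This is an illustration under stated assumptions, not the PR's implementation: the function name `is_authorized` is hypothetical, the header is assumed to use the standard `Authorization: Bearer <token>` form, and the constant-time comparison is a suggested practice rather than something the PR description confirms.

```python
import hmac
from typing import Optional


def is_authorized(authorization_header: Optional[str],
                  expected_token: Optional[str]) -> bool:
    """Return True when the request may proceed.

    expected_token stands in for the value of METRICS_AUTH_TOKEN;
    None or "" means auth is disabled.
    """
    if not expected_token:
        return True  # token not configured: endpoint is open
    if not authorization_header:
        return False  # caller would respond with 401
    scheme, _, token = authorization_header.partition(" ")
    if scheme.lower() != "bearer":
        return False
    # Constant-time comparison avoids leaking token contents via timing.
    return hmac.compare_digest(token.strip().encode(), expected_token.encode())
```

A router would call this with the incoming header and return 401 whenever it yields False.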

@github-actions

👋 This PR has had no activity for 7 days.

Please push updates or comment if you still need more time.

Inactive PRs may be closed automatically after 7 more days.

@github-actions github-actions Bot added the stale label Jun 22, 2026
Development

Successfully merging this pull request may close these issues.

Add backend endpoint for health metrics aggregation