Summary
Track this as a possible future feature: a metrics layer for both internal health signals (currently surfaced ad-hoc via /health) and API usage analytics (currently not captured anywhere).
Why
Today the API has no first-class metrics emission. Reconciliation health is exposed via the /health endpoint (see the openfga_outbox block added in the Option B+C work), and request-level signals exist only in the log files. There's no way to answer questions like:
- How many /calendar requests did we serve last month?
- Which API keys / applications are the heaviest consumers?
- What's the per-endpoint error rate trend?
- Which calendars (nations / dioceses) are queried most often?
- What's p50 / p95 / p99 latency per endpoint?
These are the questions that drive capacity planning, deprecation decisions, and outreach to heavy users.
Out of scope of the current openfga-reconciliation work
The Option B+C design (see docs/superpowers/specs/2026-06-02-openfga-async-reconciliation-design.md) deliberately uses /health polling rather than a metrics emitter, on the reasoning that adding Prometheus/StatsD wiring for two operational signals would be premature. This issue tracks the eventual graduation to a real metrics framework when more signals justify it.
Candidate substrates
- Prometheus + a /metrics endpoint: the industry default, scraped by ops infrastructure. Requires the promphp/prometheus_client_php package (or similar) and a place to register collectors.
- StatsD + DogStatsD-compatible emitter: push model, lighter integration, no scraping infrastructure on the host.
- PG aggregation table + scheduled rollup: an api_requests_hourly (or similar) table populated by middleware on each request, rolled up via a periodic SQL job. No external metrics infrastructure; everything stays in PG. Lower fidelity but zero new dependencies.
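To make the PG option more concrete, here is a minimal sketch of one way the per-request write could look: an upsert into an hourly bucket. Everything beyond the api_requests_hourly table name is an assumption for illustration: the column names, the unique key on (bucket_start, endpoint, status_code, api_key_id), and the hourBucket / recordRequest helpers are hypothetical.

```php
<?php
declare(strict_types=1);

/**
 * Truncate a timestamp to the start of its UTC hour, e.g. 14:37:09 -> 14:00:00.
 */
function hourBucket(DateTimeImmutable $at): string
{
    return $at->setTimezone(new DateTimeZone('UTC'))->format('Y-m-d H:00:00');
}

// Hypothetical schema: api_requests_hourly(bucket_start, endpoint, status_code,
// api_key_id, request_count, total_latency_ms), with a unique constraint on the
// first four columns. PostgreSQL's ON CONFLICT turns the write into an atomic upsert.
const UPSERT_SQL = <<<'SQL'
INSERT INTO api_requests_hourly
    (bucket_start, endpoint, status_code, api_key_id, request_count, total_latency_ms)
VALUES
    (:bucket_start, :endpoint, :status_code, :api_key_id, 1, :latency_ms)
ON CONFLICT (bucket_start, endpoint, status_code, api_key_id)
DO UPDATE SET
    request_count    = api_requests_hourly.request_count + 1,
    total_latency_ms = api_requests_hourly.total_latency_ms + EXCLUDED.total_latency_ms
SQL;

/**
 * Increment the counters for one served request (hypothetical helper).
 */
function recordRequest(
    PDO $pdo,
    DateTimeImmutable $at,
    string $endpoint,
    int $statusCode,
    int $apiKeyId,
    float $latencyMs
): void {
    $stmt = $pdo->prepare(UPSERT_SQL);
    $stmt->execute([
        ':bucket_start' => hourBucket($at),
        ':endpoint'     => $endpoint,
        ':status_code'  => $statusCode,
        ':api_key_id'   => $apiKeyId,
        ':latency_ms'   => $latencyMs,
    ]);
}
```

Two caveats with this shape. Summing latency only supports averages, not p50 / p95 / p99, which is part of the "lower fidelity" trade-off. And if unauthenticated requests are stored with a NULL api_key_id, the unique constraint will not merge them by default (PostgreSQL treats NULLs as distinct), so a sentinel value or a different key would be needed.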
Per-question signals to capture
| Signal | Tags / dimensions |
|---|---|
| Request count | endpoint, status code, API key ID, application ID |
| Request latency | endpoint, status code |
| Calendar data fetched | calendar type (general/national/diocesan/wider), nation, diocese |
| Auth event | event type (login, refresh, logout), result |
| Outbox state | (already in /health; could be moved here) |
| OpenFGA call count + latency | operation (check/listObjects/writeTuple/deleteTuple), result |
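As an illustration of how these signals and tags might look on the wire, the sketch below formats them as DogStatsD datagrams (name:value|type|#key:value,...). The metric names, tag keys, and the formatDatagram helper are hypothetical and only meant to show the shape; the same tag sets would map onto Prometheus labels or table columns under the other substrates.

```php
<?php
declare(strict_types=1);

/**
 * Format one metric as a DogStatsD datagram: name:value|type|#key:value,...
 * $type is 'c' (counter), 'ms' (timer), 'g' (gauge) or 'h' (histogram).
 */
function formatDatagram(string $name, int|float $value, string $type, array $tags = []): string
{
    $line = sprintf('%s:%s|%s', $name, $value, $type);
    if ($tags === []) {
        return $line;
    }
    $pairs = [];
    foreach ($tags as $key => $tagValue) {
        // Commas, pipes, hashes and whitespace would break the wire format, so replace them.
        $pairs[] = $key . ':' . preg_replace('/[,|\s#]+/', '_', (string) $tagValue);
    }
    return $line . '|#' . implode(',', $pairs);
}

// Request count, tagged with the dimensions from the table (names are illustrative).
echo formatDatagram('api.request.count', 1, 'c', [
    'endpoint'    => '/calendar',
    'status_code' => 200,
    'api_key_id'  => 42,
]), PHP_EOL;
// prints: api.request.count:1|c|#endpoint:/calendar,status_code:200,api_key_id:42

// Request latency in milliseconds.
echo formatDatagram('api.request.latency', 12.5, 'ms', [
    'endpoint'    => '/calendar',
    'status_code' => 200,
]), PHP_EOL;
// prints: api.request.latency:12.5|ms|#endpoint:/calendar,status_code:200
```

One thing to weigh when choosing a substrate: per-key tags such as API key ID multiply the number of distinct series, which matters more for Prometheus-style label sets than for a SQL table.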
Acceptance criteria (when this lands)
- A middleware captures per-request metrics with the dimensions above, with minimal latency overhead (< 1ms p99).
- /admin/metrics or /metrics (auth model TBD per substrate choice) exposes the aggregates.
- The outbox observability currently in /health migrates to the new substrate (or stays in /health if that's the simpler ops model; explicitly decide).
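A rough sketch of the capture side, under stated assumptions: RequestMetrics is a hypothetical in-memory stand-in for whichever substrate is chosen, and withMetrics wraps a plain callable rather than the API's actual middleware interface, which this issue does not specify. It shows per-request timing with a monotonic clock and a nearest-rank percentile for checking figures like p99.

```php
<?php
declare(strict_types=1);

/** In-memory stand-in for the eventual metrics substrate (hypothetical). */
final class RequestMetrics
{
    /** @var array<string, int> request counts keyed by "endpoint|status" */
    public array $counts = [];
    /** @var array<string, list<float>> latencies in milliseconds keyed by endpoint */
    public array $latenciesMs = [];

    public function record(string $endpoint, int $status, float $latencyMs): void
    {
        $key = $endpoint . '|' . $status;
        $this->counts[$key] = ($this->counts[$key] ?? 0) + 1;
        $this->latenciesMs[$endpoint][] = $latencyMs;
    }

    /** Nearest-rank percentile (0.95 for p95); null when nothing was recorded. */
    public function percentile(string $endpoint, float $p): ?float
    {
        $values = $this->latenciesMs[$endpoint] ?? [];
        if ($values === []) {
            return null;
        }
        sort($values);
        $rank = max(1, (int) ceil($p * count($values)));
        return $values[$rank - 1];
    }
}

/**
 * Wrap a handler so that every call is timed and counted.
 * $handler takes the endpoint path and returns an HTTP status code.
 */
function withMetrics(RequestMetrics $metrics, callable $handler): callable
{
    return function (string $endpoint) use ($metrics, $handler): int {
        $start = hrtime(true); // monotonic clock, in nanoseconds
        $status = 500;         // recorded if the handler throws
        try {
            $status = $handler($endpoint);
            return $status;
        } finally {
            $metrics->record($endpoint, $status, (hrtime(true) - $start) / 1e6);
        }
    };
}

// Usage: wrap a trivial handler and serve one request.
$metrics = new RequestMetrics();
$handle = withMetrics($metrics, fn (string $endpoint): int => 200);
$handle('/calendar');
```

Recording in process like this is cheap; whether the < 1ms p99 budget holds in practice would depend mostly on how the chosen substrate emits or persists the data, so that is worth measuring once a substrate is picked.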
Related
- Option B+C async reconciliation design (the proximate trigger for this issue).
- src/Repositories/ApiKeyRepository.php and src/Repositories/ApplicationRepository.php: these already model the dimensions we'd tag metrics with.
Notes
This is a track-only issue. No code lands until a separate brainstorming pass picks a substrate and scopes the rollout.