[BOUNTY] Add health check Prometheus stale-metric guard (#6) #19
Closed
leo202000 wants to merge 2 commits into
Flatten health check results into metric records and annotate each with age_seconds and a stale flag before Prometheus export. Adds --prometheus and --stale-threshold flags, a stale_metrics array in JSON output (service/environment/metric_name/timestamp/stale), secret redaction for diagnostic output, OPERATIONS.md docs, and unit tests covering fresh and stale metrics. Addresses bounty mannowell#6.
/claim I'll implement this bounty task.
Owner
Thanks for the PR. I am closing this because
Summary
Adds a stale-metric guard to the health check Prometheus output, addressing bounty #6. Health check results are flattened into metric records, each annotated with its age and a stale flag before export, so outdated metrics are flagged rather than silently scraped.
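The annotation step described above can be sketched roughly as follows. This is not the PR's actual `flag_stale_metrics()`; the function name `flag_stale_metrics_sketch`, the `now` parameter, and the assumption that `timestamp` holds Unix epoch seconds are all illustrative. Only the 300-second default, the `age_seconds` and `stale` fields, and the "missing timestamp counts as stale" rule come from the description.

```python
import time

# Default taken from the PR description (300 seconds).
STALE_METRIC_THRESHOLD_SECONDS = 300


def flag_stale_metrics_sketch(metrics, threshold=STALE_METRIC_THRESHOLD_SECONDS, now=None):
    """Return copies of metric records annotated with age_seconds and stale.

    Assumes each record is a dict whose "timestamp" is Unix epoch seconds;
    the real tool may use a different timestamp representation.
    """
    now = time.time() if now is None else now
    annotated = []
    for metric in metrics:
        record = dict(metric)  # copy so the caller's records are not mutated
        ts = record.get("timestamp")
        if isinstance(ts, (int, float)) and not isinstance(ts, bool):
            age = max(0.0, now - ts)  # clamp future timestamps to age 0
            record["age_seconds"] = age
            record["stale"] = age > threshold  # stale only when age exceeds the threshold
        else:
            # No usable timestamp: report stale rather than export silently.
            record["age_seconds"] = None
            record["stale"] = True
        annotated.append(record)
    return annotated
```

Passing `now` explicitly keeps the function deterministic, which makes the threshold boundary (an age exactly equal to the threshold is still fresh in this sketch) straightforward to test.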
Changes
- tools/health_check.py:
  - collect_health_metrics() flattens services/infrastructure/system results into metric records carrying service, environment, metric_name, timestamp, and status.
  - flag_stale_metrics() annotates each metric with age_seconds and a stale flag (stale when age exceeds STALE_METRIC_THRESHOLD_SECONDS, default 300s; metrics without a usable timestamp are reported stale so outdated data is never silently exported).
  - format_prometheus() emits Prometheus exposition text (tot_health_check_status, tot_health_check_metric_stale, tot_health_check_metric_age_seconds).
  - redact_secrets() redacts password/token/api-key/bearer values from exported diagnostic output.
  - --prometheus/-p and --stale-threshold CLI flags; a stale_metrics array is included in --json output. Default output remains unchanged.
- docs/OPERATIONS.md: documents the new metrics and the stale-metric guard.
- tests/test_health_check_stale_metrics.py: unit tests covering fresh/stale detection, threshold boundary, missing timestamps, Prometheus formatting, redaction, and default-output compatibility.
- diagnostic/build-109814a2.logd + .json: required diagnostic bundle.
Testing
- python3 tests/test_health_check_stale_metrics.py -v -> 12 tests pass (fresh/stale metrics, threshold boundary, Prometheus format, redaction, compatibility).
- python3 build.py -> diagnostic bundle generated and committed (diagnostic/build-109814a2.logd, 14583 bytes, DIAG magic).
- format_prometheus() and flag_stale_metrics() with fresh and stale timestamps.
Checklist
Addresses bounty issue #6. Please let me know the process for claiming the $25 bounty once merged.