fix(duckdb): stamp silence-watcher clock in UTC so ages survive TZ #199
Open
sronix wants to merge 1 commit into
Every timestamp the watcher compares against is stamped UTC-naive by its writers, but check_silence took its clock from naive-local datetime.now(), which only agrees while the container TZ is UTC (the python-slim default). Under TZ=Europe/Berlin every liveness age would inflate by 1-2 h against the 3 h threshold and fire false module-down Discord alerts on each 15-minute tick, while a TZ behind UTC would suppress real ones; the re-alert spacing and the last_silence_alert_at write-back skew the same way. The new regression test flips the process TZ to a fixed UTC+2 zone via time.tzset() and pins that a module seen 1.5 h ago stays quiet. The NOW() and DEFAULT CURRENT_TIMESTAMP writers on module_configs carry the residual naive-local risk and stay as a follow-up (see the chapter-11 entry this change adds).
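A minimal sketch of the mismatch described above, using a fixed instant so the arithmetic is reproducible. `is_silent` and `utc_naive_now` are hypothetical helpers for illustration, not names from `silence_watcher.py`; only the `datetime.now(timezone.utc).replace(tzinfo=None)` expression and the 3 h threshold come from this PR.

```python
from datetime import datetime, timedelta, timezone

SILENCE_THRESHOLD = timedelta(hours=3)  # the 3 h threshold named in the description


def utc_naive_now() -> datetime:
    """Current time as a naive datetime whose wall-clock fields are UTC."""
    return datetime.now(timezone.utc).replace(tzinfo=None)


def is_silent(last_seen_at: datetime, now: datetime) -> bool:
    """Hypothetical helper: True when the liveness age exceeds the threshold."""
    return now - last_seen_at > SILENCE_THRESHOLD


# One fixed instant, seen through two different clocks.
true_now = datetime(2024, 7, 1, 12, 0, tzinfo=timezone.utc)
last_seen_at = datetime(2024, 7, 1, 10, 30)  # UTC-naive, stamped 1.5 h earlier

# What naive-local datetime.now() would read in a UTC+2 container.
utc_plus_2 = timezone(timedelta(hours=2))
local_naive_now = true_now.astimezone(utc_plus_2).replace(tzinfo=None)

# What the UTC-naive clock reads at the same instant.
utc_naive = true_now.replace(tzinfo=None)

age_local = local_naive_now - last_seen_at  # 3.5 h: crosses the threshold
age_utc = utc_naive - last_seen_at          # 1.5 h: stays quiet
```

With the local clock the module looks silent even though it reported 90 minutes ago; with the UTC-naive clock the age matches the writers' frame of reference.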
What changed
- duckdb-service/services/silence_watcher.py: check_silence now takes its clock from datetime.now(timezone.utc).replace(tzinfo=None) instead of naive-local datetime.now(), mirroring the received_at writer in routes/heartbeats.py.
- Regression test (tests/test_silence_watcher.py): flips the process TZ to a fixed UTC+2 zone (Etc/GMT-2 + time.tzset(), POSIX-guarded) and pins that a module seen 1.5 h ago stays quiet, a 4 h-silent module still alerts, and last_silence_alert_at lands UTC-naive; plus recovery-fires-once and re-alert-suppression baselines.
- conftest.py purges services.silence_watcher alongside the other service modules so each test binds the fresh DB.
- Adds the chapter-11 entry that record_image's comment promised ("chapter-11 entry to follow"), and that comment now points at it.
- ADR-005's four silence_watcher.py:NN citations converted to symbol style since this diff shifted them.

Why
Every timestamp the watcher compares against is stamped UTC-naive by its writers, but the watcher's clock was container-local, which agrees only while the container TZ is UTC (the python-slim default). A TZ=Europe/Berlin override would inflate every liveness age 1-2 h against the 3 h threshold and fire false module-down Discord alerts on each 15-minute tick; a TZ behind UTC would suppress real outages. Reader-side twin of the chapter-11 image_uploads.uploaded_at incident. Acknowledged follow-up, deliberately not in this diff: add_module's last_seen_at = NOW() and the schema's DEFAULT CURRENT_TIMESTAMP columns remain container-local timestamp sources (tracked in the new chapter-11 entry).

How tested
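The TZ flip the regression test relies on can be sketched roughly as below. This is an illustration under assumptions, not the code in tests/test_silence_watcher.py: `process_tz` and `clock_skew_seconds` are hypothetical names, and the sketch measures the clock skew directly rather than exercising check_silence.

```python
import os
import time
from contextlib import contextmanager
from datetime import datetime, timezone


@contextmanager
def process_tz(name: str):
    """Temporarily switch the process time zone (POSIX only)."""
    if not hasattr(time, "tzset"):
        raise RuntimeError("time.tzset() is unavailable on this platform")
    previous = os.environ.get("TZ")
    os.environ["TZ"] = name
    time.tzset()  # make the C library re-read TZ
    try:
        yield
    finally:
        # Restore the original setting so other tests are unaffected.
        if previous is None:
            os.environ.pop("TZ", None)
        else:
            os.environ["TZ"] = previous
        time.tzset()


def clock_skew_seconds() -> float:
    """Seconds by which naive-local now() leads the UTC-naive clock."""
    local = datetime.now()
    utc = datetime.now(timezone.utc).replace(tzinfo=None)
    return (local - utc).total_seconds()


if hasattr(time, "tzset"):
    # Etc/GMT-2 is UTC+2: POSIX-style zone names invert the sign.
    with process_tz("Etc/GMT-2"):
        skew = clock_skew_seconds()
```

Where the zone database is installed, `skew` should come out close to 7200 s, the same offset reported for last_silence_alert_at against the unfixed watcher. Restoring TZ in the `finally` block keeps the flip from leaking into later tests.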
pio test -e native)
pytest tests/e2e)

237 passed locally (232 on main + new) on Python 3.14. The regression test fails against the unfixed watcher: false "down" alert for a module seen 90 min ago under UTC+2, and last_silence_alert_at lands 7200 s off UTC. scripts/check-doc-citations.sh: 3 OK, 0 problems. ruff check + format clean. Other suites untouched by this change.

Checklist