chore(ci): add RabbitMQ healthcheck and CI wait step to prevent startup race condition #3569
Sukuna0007Abhi wants to merge 2 commits into augurlabs:main
…up race Signed-off-by: Sukuna0007Abhi <appsonly310@gmail.com>
713e4a4 to 470ad45
…bitMQ; avoid -d Signed-off-by: Sukuna0007Abhi <appsonly310@gmail.com>
58011a9 to 7f31b89
Thanks for the contribution. Did you mean to open this PR for #3548? If not, can you please link this PR to an issue?
Actually @shlokgilda, I found this by exploring the CI failure at https://github.com/chaoss/augur/actions/runs/20908242825/job/60065967943?pr=3534. I first proposed a fix on Slack (https://chaoss-workspace.slack.com/archives/C0226ELG6R4/p1768225953312839?thread_ts=1768225953.312839&cid=C0226ELG6R4) and then made some further improvements. The result is similar to the race condition PR #3548, but it is not fully the same as issue #3548.
I think this has more to do with augur trying to connect at a time when augur is inexplicably restarting the db? So yeah, likely the same underlying race condition fix. That said, the contents of this PR just add a health check for rabbitmq too (assuming this health check works / is supported by documentation from rabbit). It won't fix the race condition, but maybe it's still useful?
Superseded by #3613
I proposed this first; see https://chaoss-workspace.slack.com/archives/C0226ELG6R4/p1768225953312839?thread_ts=1768225953.312839&cid=C0226ELG6R4
Description
Added a Docker healthcheck for RabbitMQ and a CI step that waits for RabbitMQ to be healthy before streaming logs / running E2E checks.
On first startup RabbitMQ needs time to finish syncing its Mnesia tables, and connection attempts made before that finished were causing some CI timeouts. The healthcheck gives RabbitMQ enough time to start up correctly and stops premature connection attempts, which prevents flaky E2E failures when services aren't fully initialized. Found in (https://github.com/chaoss/augur/actions/runs/20908242825/job/60065967943?pr=3534)
Changes
Add healthcheck to rabbitmq in docker-compose.yml (uses rabbitmq-diagnostics ping).
Update build_docker.yml start step to start compose detached, poll RabbitMQ readiness, then stream logs into await_all.py (increased timeout).
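For reviewers who want to reason about the "poll RabbitMQ readiness" step, here is a minimal Python sketch of the idea. It is not the code in this PR (the PR does the polling inside the build_docker.yml workflow); the function names `run_ping` and `wait_for_rabbitmq`, the retry count, and the delay are all hypothetical. The only thing taken from the PR is the ping command from the Test section, and the service name `rabbitmq` is assumed to match docker-compose.yml.

```python
import subprocess
import time
from typing import Callable, Sequence

# Ping command as listed in the Test section of this PR.
# Assumption: the compose service is named "rabbitmq".
PING_CMD = ["docker", "compose", "exec", "-T", "rabbitmq",
            "rabbitmq-diagnostics", "-q", "ping"]


def run_ping(cmd: Sequence[str] = PING_CMD) -> bool:
    """Return True when the ping command exits with status 0."""
    try:
        result = subprocess.run(cmd, capture_output=True, timeout=30)
    except (OSError, subprocess.TimeoutExpired):
        # docker missing, or the command hung: treat as "not ready yet".
        return False
    return result.returncode == 0


def wait_for_rabbitmq(ping: Callable[[], bool] = run_ping,
                      attempts: int = 30,            # hypothetical default
                      delay_seconds: float = 2.0,    # hypothetical default
                      sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll `ping` until it succeeds or `attempts` checks have failed."""
    for attempt in range(1, attempts + 1):
        if ping():
            return True
        if attempt < attempts:
            # Wait between checks, but not after the final failed check.
            sleep(delay_seconds)
    return False
```

A CI step could call `wait_for_rabbitmq()` after starting compose detached and fail the job if it returns `False`. Passing `ping` and `sleep` as parameters keeps the retry logic testable without a running broker.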
Test
Locally: docker compose up --build and docker compose exec -T rabbitmq rabbitmq-diagnostics -q ping should succeed.
CI: E2E should wait for RabbitMQ and run reliably.
Notes for Reviewers
Signed commits