Concurrent matrix jobs on the shared self-hosted demeter* runners intermittently fail at the Codecov step (codecov/codecov-action) with:
error: could not lock config file /home/<user>/.gitconfig: File exists
##[error]Process completed with exit code 255
Root cause: those runners share a single $HOME, and therefore one ~/.gitconfig. The Codecov action writes git config during setup; when several jobs reach that step at the same time they race for the gitconfig lock, and the loser dies. The Julia tests themselves pass: the job fails only on this setup step, and fail_ci_if_error: false does not help because the crash happens before the upload. The failure is flaky (re-runs and non-overlapping schedules go green) and reproduces independently of Julia version and test group.
This is a runner-fleet issue, not per-repo: it can hit any repo running concurrent jobs on these runners.
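The lock failure can be reproduced deterministically outside CI. The sketch below is illustrative only: it assumes git takes the lock by creating ~/.gitconfig.lock, and it simulates a competing job by creating that file by hand in a throwaway HOME. The names demo_home and demo.key are hypothetical and are not part of the runner setup or the Codecov action.

```shell
#!/usr/bin/env bash
set -u

# Throwaway HOME so the real ~/.gitconfig is never touched (hypothetical path).
demo_home="$(mktemp -d)"
touch "$demo_home/.gitconfig"        # the shared global config file
touch "$demo_home/.gitconfig.lock"   # simulate another job already holding the lock

# The "losing" writer: git cannot create .gitconfig.lock, so the write fails.
if env -u GIT_CONFIG_GLOBAL LC_ALL=C HOME="$demo_home" \
     git config --global demo.key value 2>"$demo_home/err.txt"; then
  lock_result="acquired"
else
  lock_result="failed"
fi

cat "$demo_home/err.txt"
echo "lock_result=$lock_result"
```

On the runners the competing lock is held only briefly by another job, which is why the failure shows up intermittently rather than on every run.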
Fix: give each job an isolated HOME (or set GIT_CONFIG_GLOBAL=$RUNNER_TEMP/gitconfig per job) in the self-hosted runner provisioning, so concurrent git-config writes can't collide. Worth fixing in the shared runner setup rather than working around per-repo.
Surfaced during the SciML CI-centralization work (originally seen on Integrals.jl PR CI).