FOD drv/output bind-mount patching leaks /tmp/<hash>.patched.<rand> mounts on build interruption → GC aborts with EBUSY

## Summary

Determinate Nix 3.17.3 (Nix 2.33.3) leaks bind mounts from its FOD drv/output patching mechanism when builds are interrupted (SIGTERM / cancellation). The leaked mounts persist in the host mount namespace across daemon restarts and eventually break `nix-collect-garbage` entirely.

On one CI-runner host I observed **793 leaked mountpoints**, all about 3 days old, which silently broke every subsequent GC: `nix-collect-garbage` aborts on the first EBUSY `unlink()` with `0 store paths deleted, 0.0 KiB freed`. As a result, the root filesystem (1.7T) filled to 100%.

## Observed pattern

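For every affected store path, `/proc/1/mountinfo` shows an entry of the following form, where the fourth field is the bind-mount source under `/tmp` and the fifth is the store path it is mounted over:

```
<mount-id> <parent-id> 254:0 /tmp/<storeHash>-<name>.patched.<rand> /nix/store/<storeHash>-<name> rw,relatime shared:1 - ext4 /dev/mapper/vg0-root0 rw,stripe=32
```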

Example:

```
/tmp/m7i135y30si5vafq4n3q5489xmcslcfm-pnpm-install.drv.patched.liPdju
  → /nix/store/m7i135y30si5vafq4n3q5489xmcslcfm-pnpm-install.drv
/tmp/94m2xm7qgp1mkmv04rwdrk7qh46rhqnr-pnpm-install.patched.M4wFNa
  → /nix/store/94m2xm7qgp1mkmv04rwdrk7qh46rhqnr-pnpm-install
```

Both the `.drv` file itself and the FOD output directory get the bind-mount treatment.
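
To check whether a host is affected, the pattern above can be matched mechanically. The following is a minimal sketch, assuming the leaked entries always carry `.patched.` in the fourth mountinfo field and a mount point under `/nix/store`; the function name `list_patched_mounts` is hypothetical and not part of Nix.

```bash
# list_patched_mounts [MOUNTINFO]
# Print "<source> <mount point>" for every mountinfo entry whose bind-mount
# source (field 4) contains ".patched." and whose mount point (field 5) is
# under /nix/store. MOUNTINFO defaults to /proc/1/mountinfo; passing another
# file makes the filter easy to try against a saved copy.
list_patched_mounts() {
  local mountinfo="${1:-/proc/1/mountinfo}"
  awk '$4 ~ "\\.patched\\." && index($5, "/nix/store/") == 1 { print $4, $5 }' \
    "$mountinfo"
}
```

Running `list_patched_mounts | wc -l` then gives the number of leaked mounts on the host.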

## GC failure

```console
$ nix-collect-garbage -d
finding garbage collector roots...
deleting garbage...
deleting '/nix/store/m7i135y30si5vafq4n3q5489xmcslcfm-pnpm-install.drv'
error: cannot unlink "/nix/store/m7i135y30si5vafq4n3q5489xmcslcfm-pnpm-install.drv": Device or resource busy
0 store paths deleted, 0.0 KiB freed
```

GC aborts on the first EBUSY rather than skipping-and-continuing, so a single stale mount zeroes out GC progress indefinitely. Combined with `nix.settings.min-free` / `max-free` inline GC (which silently no-ops for the same reason), disk usage grows unbounded until builds start failing on ENOSPC.

## Suspected trigger

A CI runner using GitHub Actions `concurrency.cancel-in-progress: true`: cancellations send SIGTERM to Nix builds mid-FOD-patch, and the cleanup path for the `.patched` bind mount apparently doesn't run.

## Workaround

```bash
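# In mountinfo, field 5 is the mount point (the store path). Lazily unmount
# every entry whose line mentions a *.patched.* bind-mount source.
grep -E '\.patched\.' /proc/1/mountinfo | awk '{print $5}' | sort -u \
  | xargs -r -n1 -P4 sudo umount -l
# With the stale mounts gone, GC can unlink the store paths again.
nix-collect-garbage -d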
```

After unmounting all 793 stale mounts, GC proceeded normally and freed 715 GiB (227,226 paths). No active builds were disrupted (the two "live"-looking mounts also turned out to be 3 days old and equally stale).
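
On a host where some builds may still be running, it can be safer to review the list before unmounting anything. Below is a sketch of a dry-run variant of the workaround, assuming the same `.patched.` naming; the function name `reap_patched_mounts` and its `--apply` flag are hypothetical, chosen for illustration.

```bash
# reap_patched_mounts [--apply] [MOUNTINFO]
# Without --apply, only print the umount commands that would run.
# With --apply, lazily unmount each matching mount point via sudo.
reap_patched_mounts() {
  local apply=0
  if [ "${1:-}" = "--apply" ]; then apply=1; shift; fi
  local mountinfo="${1:-/proc/1/mountinfo}"
  local target
  # Field 4 is the bind-mount source, field 5 the mount point to unmount.
  awk '$4 ~ "\\.patched\\." { print $5 }' "$mountinfo" | sort -u |
    while IFS= read -r target; do
      if [ "$apply" -eq 1 ]; then
        sudo umount -l "$target" || echo "failed to unmount $target" >&2
      else
        echo "umount -l $target"
      fi
    done
}
```

Calling `reap_patched_mounts` prints the plan; `reap_patched_mounts --apply` carries it out, after which `nix-collect-garbage -d` can be run as above.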

## Environment

- Determinate Nix 3.17.3 (Nix 2.33.3)
- NixOS x86_64, kernel 6.18.13
- Workload: GitHub Actions self-hosted runner building `pnpm-install`-style FODs under heavy concurrency with frequent cancellation

## Suggested fixes

1. **Make GC resilient** to EBUSY on unlink: log and skip instead of aborting the run. A single leaked mount should not zero out GC.
2. **Reap stale `.patched.*` mounts** on daemon startup (clear ones whose source `/tmp/...patched.<rand>` is older than some threshold and whose target store path isn't owned by a live build).
3. **Install SIGTERM/cleanup handlers** around the FOD patching bind-mount so it unmounts on abnormal build termination.
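
The log-and-skip behaviour proposed in fix 1 can be illustrated with a small shell loop. This is only a sketch of the control flow, not Nix's GC code: the function name `delete_skipping_failures` is hypothetical, and any `rm` failure stands in for the EBUSY that `unlink()` returns on a mount point.

```bash
# delete_skipping_failures PATH...
# Try to remove each path. On failure, log to stderr and move on to the next
# path instead of aborting. Prints a summary "deleted=N skipped=M" on stdout.
# Illustration only: do not point this at a real /nix/store.
delete_skipping_failures() {
  local deleted=0 skipped=0 path
  for path in "$@"; do
    if rm -r -- "$path" 2>/dev/null; then
      deleted=$((deleted + 1))
    else
      echo "skipping '$path': could not delete (e.g. busy mount point)" >&2
      skipped=$((skipped + 1))
    fi
  done
  echo "deleted=$deleted skipped=$skipped"
}
```

With this shape, one undeletable path costs one skipped entry rather than the whole run, which is the property the GC currently lacks.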

Happy to provide more data if useful.


<details>
<summary>Posted on behalf of @schickling</summary>

| field | value |
| --- | --- |
| `agent_name` | 👁️ cl1-iris |
| `agent_session_id` | 420ca8a2-8003-42d7-a440-a7cd4d317076 |
| `agent_tool` | Claude Code |
| `agent_tool_version` | 2.1.118 (Claude Code) |
| `agent_runtime` | Claude Code 2.1.118 (Claude Code) |
| `agent_model` | claude-opus-4-7 |
| `worktree` | dotfiles/main |
| `machine` | dev3 |
| `tooling_profile` | dotfiles@f937ca8-dirty |
</details>
