feat: add graceful loop device cleanup on Ctrl+C interrupt#583
feat: add graceful loop device cleanup on Ctrl+C interrupt#583rpandya28 wants to merge 9 commits into
Conversation
There was a problem hiding this comment.
Pull request overview
This PR aims to make raw image builds more resilient by cleaning up loop devices when the process is interrupted (Ctrl+C / SIGTERM), and by making loop-device teardown more robust (swapoff, unmounts, and detach fallbacks).
Changes:
- Add SIGINT/SIGTERM handling in
RawMaker.BuildRawImage()to detach the loop device on interrupt. - Enhance
LoopDev.LoopSetupDelete()to disable swap, unmount mounted partitions, and handle “already detached” cases.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
internal/image/rawmaker/rawmaker.go |
Adds signal handling + goroutine-triggered cleanup for loop devices during raw image builds. |
internal/image/imagedisc/loopdev.go |
Expands loop device cleanup to include swapoff + unmounts, and adds detach fallbacks. |
| defer func() { | ||
| signal.Stop(sigChan) | ||
| close(sigChan) | ||
| if loopDevPath != "" { | ||
| if detachErr := rawMaker.LoopDev.LoopSetupDelete(loopDevPath); detachErr != nil { |
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Replace the pattern of closing sigChan (the os.Signal channel) to signal
goroutine completion with a dedicated done channel and a select statement.
Previously the defer called close(sigChan) which would unblock the
goroutine's channel receive — relying on the ok check to distinguish a
real signal from normal completion. This conflates two concerns onto one
channel and is confusing.
With the new approach:
- sigChan is only ever written by the OS signal runtime (never closed)
- done is a separate struct channel closed by defer on return
- select{} in the goroutine makes the two paths explicit:
case sig := <-sigChan → real interrupt, cleanup + os.Exit(1)
case <-done → normal/error return, defer handles cleanup
| // Goroutine waits for either an OS signal or normal completion. | ||
| // Using select + a dedicated done channel avoids closing sigChan and | ||
| // makes the two paths explicit: real interrupt vs. function returned. | ||
| go func() { |
There was a problem hiding this comment.
This goroutine calls os.Exit(1), the entire process terminates immediately. os.Exit does not run any deferred functions, not the defer in BuildRawImage at rawmaker.go:145, not defers in the provider, not defers in main(). So chroot mounts, temp directories, and any other resources managed higher in the stack are leaked.
The solution is to not call os.Exit from library code and instead move signal handling up to the cmd level where it can orchestrate full cleanup.
| // Goroutine waits for either an OS signal or normal completion. | ||
| // Using select + a dedicated done channel avoids closing sigChan and | ||
| // makes the two paths explicit: real interrupt vs. function returned. | ||
| go func() { |
There was a problem hiding this comment.
Race condition: imagine a Ctrl+C arrives while the build is finishing normally. The signal gets queued in sigChan. Then the defer runs, it calls signal.Stop and close(done). At this point the goroutine's select sees two ready cases: the queued signal in sigChan AND the closed done channel. Go picks one at random. If it picks the signal case, both the goroutine and the defer end up calling LoopSetupDelete on the same loop device at the same time. Wrap the cleanup in a sync.Once so only one of them actually runs it.
| switch v := data.(type) { | ||
| case map[string]interface{}: | ||
| // Check if this entry has a mountpoint | ||
| if mountPoint, ok := v["mountpoint"].(string); ok && mountPoint != "" && mountPoint != "null" { |
There was a problem hiding this comment.
After json.Unmarshal, a JSON null becomes Go nil, not the string "null". The .(string) type assertion already filters out nil values, so this comparison mountPoint != "null" is dead code.
|
Need to add unit test for new functions. At minimum for |
- Remove os.Exit() from library code, replace with error propagation - Add sync.Once to prevent race condition on cleanup - Remove dead code (mountPoint != "null" check) - Add shell escaping for mount points with strconv.Quote() - Add 14 comprehensive unit tests for findAndUnmount and findAndDisableSwap - Use context.WithCancel() instead of closing channels Addresses review comments: - Fixes issue where signal handler could bypass deferred cleanup - Prevents simultaneous cleanup calls from goroutine and defer - Ensures proper resource cleanup through defer chain on signal interrupt - Validates all partition unmounting and swap disabling operations
Merge Checklist
All boxes should be checked before merging the PR
Description
Any Newly Introduced Dependencies
How Has This Been Tested?