fix: retry partial writes in packet and push paths #22

Open
rmorgans wants to merge 3 commits into mobydeck:main from rmorgans:fix/write-all

@rmorgans

Summary

  • Extract write_all() retry loop and use it in write_packet_or_fail, push_main, and send_kill
  • write() returning 0 (no progress) is treated as failure with errno=EIO
  • Includes deterministic fault-injection regression tests (preload_short_write.c)
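
For readers who have not opened the diff, the retry loop described above could look roughly like the following. This is a minimal sketch, not the code from this PR: the signature, the 0/-1 return convention, and the EINTR retry are assumptions. Only the retry-until-complete behavior and the errno=EIO on a zero-byte write come from the summary.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Sketch: write exactly len bytes to fd, or fail.
 * Assumed convention: returns 0 on success, -1 on error with errno set. */
static int write_all(int fd, const void *buf, size_t len)
{
    const unsigned char *p = buf;

    while (len > 0) {
        ssize_t n = write(fd, p, len);

        if (n < 0) {
            if (errno == EINTR)   /* assumption: retry when interrupted before any byte was written */
                continue;
            return -1;            /* errno was set by write() */
        }
        if (n == 0) {             /* no progress: fail instead of looping forever */
            errno = EIO;
            return -1;
        }
        p += n;                   /* short write: advance past the bytes already sent */
        len -= (size_t)n;
    }
    return 0;
}
```

With a helper of this shape, a caller only has to distinguish "whole packet sent" from "failed", which is the complete-or-fail contract the packet, push, and kill paths need.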

What bug this fixes

The old code issued a single write() call for each packet-sized buffer and treated any short write as fatal. On a Unix domain socket, write() can return a short count under memory pressure or when interrupted by a signal, so the old code caused spurious session disconnects.

How it's tested

An LD_PRELOAD / DYLD_INSERT_LIBRARIES shim (tests/preload_short_write.c) forces the first socket write to complete with only 1 byte. The test verifies that push and kill commands still succeed despite the short write. These tests fail on the current code and pass with the fix.

Test results

206/207 pass (1 pre-existing failure: test 93, macOS symlink issue). ASan+UBSan clean.

rmorgans and others added 3 commits March 14, 2026 22:51
Extract write_all() and use it everywhere a socket write must be
complete-or-fail. Fixes spurious disconnects under memory pressure
or signal interruption that returns a short count.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

send_kill() still used a bare write() for the kill packet.
Apply the same write_all() retry loop as push and attach paths.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

- write_all: return -1 with errno=EIO when write() returns 0 (no
  progress), preventing an infinite retry loop
- Include fault injection tests (preload_short_write.c) that force
  short socket writes to deterministically verify the retry logic
  in push and kill paths

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
rmorgans added a commit to rmorgans/atch that referenced this pull request Mar 15, 2026