Background
The sample dstack-vmm.service unit in running-an-mpc-node-in-tdx-external-guide.md (the "VMM service persistence" section) does not set KillMode=, so systemd applies the default, control-group. The qemu CVMs run as child processes inside the unit's cgroup, so any restart of dstack-vmm sends SIGTERM to every CVM on the host and takes all co-located nodes down at once. This covers maintenance restarts, upgrades, and needrestart bouncing the service after a shared-library or unattended-upgrades update. The qemu log shows: qemu-system-x86_64: terminating on signal 15 from pid 1 (systemd).
Setting KillMode=process would make systemd kill only the daemon's main process on stop, leaving the running CVMs alive across a daemon restart.
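For testing, the proposed change could be applied as a drop-in rather than by editing the unit in place. The fragment below is a minimal sketch: the drop-in path is an assumption (the usual location written by systemctl edit), and only the KillMode= line comes from this issue.

```ini
# Assumed path: /etc/systemd/system/dstack-vmm.service.d/override.conf
[Service]
# On stop/restart, signal only the dstack-vmm main process.
# The qemu CVM children in the unit's cgroup are left running.
KillMode=process
```

After adding it, systemctl daemon-reload is needed for the change to take effect. Note that the systemd documentation advises against KillMode=process in general, because processes can escape the service manager's lifecycle handling; here that escape is the intended behavior.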
Acceptance Criteria
- Verify on a test host (or localnet) that with KillMode=process set, restarting dstack-vmm leaves the qemu CVMs running and the restarted daemon cleanly re-adopts them (reconnects to the existing supervisor, no VM restart).
- If verified, add KillMode=process (with an explanatory comment) to the documented dstack-vmm.service unit in running-an-mpc-node-in-tdx-external-guide.md.
- If KillMode=process does not cleanly re-adopt running CVMs, document the per-CVM systemd-scope alternative instead.
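The first criterion could be checked roughly as follows. This is a sketch under assumptions, not a finished test: it assumes the CVMs are identifiable by qemu-system-x86_64 in their command line, and the helper names (list_qemu_pids, missing_pids, check_restart) are hypothetical. It only confirms that the qemu PIDs survive the restart; whether the daemon re-adopts them still has to be checked through dstack-vmm itself.

```shell
#!/bin/sh
# Hypothetical helpers for checking that qemu CVMs survive a dstack-vmm restart.

# Print the PIDs of processes whose command line mentions qemu-system-x86_64.
# Assumption: this pattern matches the CVM processes and nothing else relevant.
list_qemu_pids() {
    pgrep -f qemu-system-x86_64
}

# missing_pids BEFORE AFTER
# Print each PID from BEFORE that does not appear in AFTER.
missing_pids() {
    mp_after=" $(echo $2) "   # collapse newlines/spaces into single spaces
    for mp_pid in $1; do
        case "$mp_after" in
            *" $mp_pid "*) ;;          # still running
            *) echo "$mp_pid" ;;       # gone after the restart
        esac
    done
}

# Restart the service and report any qemu PIDs that disappeared.
# Needs root (or sudo) on the test host; not invoked automatically.
check_restart() {
    cr_before=$(list_qemu_pids)
    systemctl restart dstack-vmm
    sleep 5                            # arbitrary settle time
    cr_after=$(list_qemu_pids)
    cr_lost=$(missing_pids "$cr_before" "$cr_after")
    if [ -z "$cr_lost" ]; then
        echo "all qemu PIDs survived the restart"
    else
        echo "qemu PIDs lost across restart: $cr_lost"
        return 1
    fi
}
```

Running check_restart once with the default KillMode and once with KillMode=process on the same test host should show the difference; unchanged PIDs also imply that no VM was restarted.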