docs: set KillMode=process on the dstack-vmm systemd unit so a daemon restart doesn't kill all CVMs #3462

@barakeinav1

Description

Background

The sample dstack-vmm.service unit in running-an-mpc-node-in-tdx-external-guide.md (the "VMM service persistence" section) does not set KillMode=, so systemd applies the default, control-group. The qemu CVMs run as child processes inside the unit's cgroup. As a result, any restart of dstack-vmm (maintenance, an upgrade, or needrestart bouncing the service after a shared-library or unattended-upgrades update) sends SIGTERM to every CVM on the host and takes all co-located nodes down at once. The qemu log shows: qemu-system-x86_64: terminating on signal 15 from pid 1 (systemd).

Setting KillMode=process would make systemd kill only the daemon's main process on stop, leaving the running CVMs alive across a daemon restart.
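
As a rough illustration of the proposed change, the setting could be tried on a test host through a drop-in before the documented unit is edited. The drop-in path and file name below are assumptions for illustration and are not taken from the guide; only the KillMode=process line comes from this issue.

```ini
# Hypothetical drop-in: /etc/systemd/system/dstack-vmm.service.d/killmode.conf
[Service]
# On stop or restart, signal only the daemon's main process.
# The qemu CVM child processes in the unit's cgroup are left running.
KillMode=process
```

After adding the file, `systemctl daemon-reload` is needed for systemd to pick it up. One caveat worth weighing during verification: the systemd documentation discourages KillMode=process because leftover processes escape the service manager's lifecycle management, so a plain `systemctl stop dstack-vmm` would also leave the CVMs running.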

Acceptance Criteria

  • Verify on a test host (or localnet) that with KillMode=process set, restarting dstack-vmm leaves the qemu CVMs running and the restarted daemon cleanly re-adopts them (reconnects to the existing supervisor, no VM restart).
  • If verified, add KillMode=process (with an explanatory comment) to the documented dstack-vmm.service unit in running-an-mpc-node-in-tdx-external-guide.md.
  • If KillMode=process does not cleanly re-adopt running CVMs, document the per-CVM systemd-scope alternative instead.
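
The first criterion could be checked with a small script along these lines. This is a minimal sketch, not a tested procedure: it assumes the CVMs are visible to `pgrep -f qemu-system-x86_64`, that the unit is named dstack-vmm, and that a 10-second settle time is enough (an arbitrary choice). The helper names are hypothetical. It only confirms that the same qemu processes survive the restart; whether the daemon actually re-adopts them still has to be confirmed from dstack-vmm's own logs or API.

```python
import subprocess
import time

QEMU_PATTERN = "qemu-system-x86_64"  # process name taken from the log line in this issue
UNIT = "dstack-vmm"                  # unit name taken from this issue
SETTLE_SECONDS = 10                  # assumption: arbitrary wait after the restart


def start_time(pid):
    """Return the process start time (clock ticks since boot), or None if it is gone."""
    try:
        with open(f"/proc/{pid}/stat") as f:
            stat = f.read()
    except OSError:
        return None
    # The comm field may contain spaces, so split after the last ')'.
    fields = stat.rsplit(")", 1)[1].split()
    return int(fields[19])  # field 22 of /proc/<pid>/stat: starttime


def snapshot():
    """Return a set of (pid, start_time) pairs for all matching qemu processes."""
    out = subprocess.run(
        ["pgrep", "-f", QEMU_PATTERN], capture_output=True, text=True
    ).stdout
    procs = set()
    for pid in out.split():
        started = start_time(int(pid))
        if started is not None:
            procs.add((int(pid), started))
    return procs


def compare(before, after):
    """Classify processes; pairing pid with start time guards against PID reuse."""
    return {
        "survived": before & after,
        "killed": before - after,
        "new": after - before,
    }


def main():
    before = snapshot()
    subprocess.run(["systemctl", "restart", UNIT], check=True)
    time.sleep(SETTLE_SECONDS)
    result = compare(before, snapshot())
    for name, procs in result.items():
        print(f"{name}: {sorted(procs)}")
    # Pass only if there were CVMs to begin with and none were killed or relaunched.
    ok = bool(before) and not result["killed"] and not result["new"]
    raise SystemExit(0 if ok else 1)


if __name__ == "__main__":
    main()
```

Run as root on the test host, once with the default KillMode and once with KillMode=process. With the default, every CVM is expected to appear under "killed"; with KillMode=process, all should appear under "survived" and "new" should stay empty.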

Labels

documentation (Improvements or additions to documentation)
