mpc-node: handle SIGTERM for graceful shutdown on operator stop #3409

@barakeinav1

Background

mpc-node does not install a SIGTERM handler today. When an operator stops the process via docker stop, kubectl delete, systemctl stop, or dstack's CVM stop command, the orchestrator sends SIGTERM first and then SIGKILL once a grace period expires (10 s by default for Docker, 30 s for Kubernetes, 90 s for systemd). Because no handler is installed, SIGTERM falls through to its default action and has the same effect as SIGKILL: the OS terminates the process immediately, the embedded near-indexer thread is killed mid-write, and the next start can land on inconsistent RocksDB state.

This issue was surfaced while investigating docs/investigation/2121-back-migration-e2e-flake.md, where a CI test SIGKILLs mpc-node mid-flight and the next start panics ~65–80% of the time inside near-indexer (see docs/investigation/nearcore-indexer-sigkill-restart-panic.md). Production stops via dstack/Docker/Kubernetes/systemd take the same code path, so any production stop today carries the same restart-corruption risk as the test scenario.

User Story

As an operator stopping an MPC node via my orchestrator (dstack CVM stop / Docker / Kubernetes / systemd), I want the node to receive SIGTERM, finish in-flight commits, and exit cleanly within the grace period — so that the next start finds RocksDB in a consistent state and doesn't trip an indexer restart panic.
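To make the "within the grace period" requirement concrete, here is a minimal std-only sketch of a teardown that is bounded by a time budget. Everything in it is an assumption made for illustration: the names `teardown_budget`, `run_teardown_with_budget`, and `TeardownOutcome` are hypothetical, and the choice of half the shortest grace period as the budget is arbitrary. It is not how mpc-node is implemented; it only shows one way to make sure a slow teardown is noticed before the orchestrator's SIGKILL arrives.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

// Grace periods quoted in the Background section of this issue.
const DOCKER_GRACE: Duration = Duration::from_secs(10);
const KUBERNETES_GRACE: Duration = Duration::from_secs(30);
const SYSTEMD_GRACE: Duration = Duration::from_secs(90);

/// Hypothetical result type for a bounded teardown.
#[derive(Debug, PartialEq, Eq)]
enum TeardownOutcome {
    Completed,
    DidNotComplete,
}

/// Assumption: budget half of the shortest grace period, leaving headroom
/// before the orchestrator escalates to SIGKILL.
fn teardown_budget() -> Duration {
    DOCKER_GRACE.min(KUBERNETES_GRACE).min(SYSTEMD_GRACE) / 2
}

/// Runs `teardown` on a worker thread and waits at most `budget` for it.
fn run_teardown_with_budget<F>(teardown: F, budget: Duration) -> TeardownOutcome
where
    F: FnOnce() + Send + 'static,
{
    let (done_tx, done_rx) = mpsc::channel();
    thread::spawn(move || {
        teardown();
        // The receiver may already have given up; ignore the send error.
        let _ = done_tx.send(());
    });
    match done_rx.recv_timeout(budget) {
        Ok(()) => TeardownOutcome::Completed,
        // Covers both a timeout and a teardown thread that panicked
        // (which drops the sender without sending).
        Err(_) => TeardownOutcome::DidNotComplete,
    }
}

fn main() {
    println!("budget: {:?}", teardown_budget());
    let fast = run_teardown_with_budget(|| {}, teardown_budget());
    println!("fast teardown: {fast:?}");
    let slow = run_teardown_with_budget(
        || thread::sleep(Duration::from_secs(2)),
        Duration::from_millis(50),
    );
    println!("slow teardown: {slow:?}");
}
```

On `DidNotComplete` a caller could log the overrun and exit anyway, so a hung teardown step surfaces in the logs instead of silently waiting for SIGKILL.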

Acceptance Criteria

  • mpc-node installs a SIGTERM handler that routes the signal into the existing internal shutdown channel (shutdown_signal_sender), so the same tokio::select! arm that handles TEE image-hash shutdowns also handles SIGTERM.
  • After the main select! exits, near_async::shutdown_all_actors() is called so nearcore's actor system can commit any in-flight RocksDB batches before the process exits.
  • tracing::warn!("SIGTERM received, initiating graceful shutdown") is emitted when the signal arrives, so operators can confirm the path was taken.
  • Verified in CI: after SIGTERM, mpc-node exits gracefully (typically within 100 ms), whereas in the pre-fix state the SIGKILL fallback fires.
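The criteria above can be modelled with the standard library alone. The sketch below is a simplified stand-in, not the mpc-node code: a `std::sync::mpsc` channel plays the role of `shutdown_signal_sender`, a blocking `recv()` plays the role of the `tokio::select!` arm, and a spawned thread plays the role of the signal listener. The names `ShutdownReason`, `Step`, `run_until_shutdown`, and `simulate` are hypothetical. In the real node the listener would presumably be built on `tokio::signal::unix::signal(SignalKind::terminate())`, and the recorded steps mark where `tracing::warn!` and `near_async::shutdown_all_actors()` would be called.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Hypothetical reasons that can trigger the shared shutdown path.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ShutdownReason {
    TeeImageHash,
    Sigterm,
}

/// Teardown steps, recorded so their order can be checked.
#[derive(Debug, PartialEq, Eq)]
enum Step {
    LoggedWarning(ShutdownReason),
    MainLoopExited,
    ActorsShutDown,
}

/// Stand-in for the main loop: block until any source sends a reason,
/// then run the teardown steps in a fixed order.
fn run_until_shutdown(shutdown_rx: mpsc::Receiver<ShutdownReason>) -> Vec<Step> {
    let mut steps = Vec::new();
    // recv() fails only if every sender was dropped without sending.
    if let Ok(reason) = shutdown_rx.recv() {
        // Placeholder for the tracing::warn! line; in the real node the
        // signal task itself would likely emit it when the signal arrives.
        steps.push(Step::LoggedWarning(reason));
    }
    steps.push(Step::MainLoopExited);
    // Placeholder for near_async::shutdown_all_actors().
    steps.push(Step::ActorsShutDown);
    steps
}

/// Simulates one shutdown source firing shortly after startup.
fn simulate(reason: ShutdownReason) -> Vec<Step> {
    let (shutdown_tx, shutdown_rx) = mpsc::channel();
    let source = thread::spawn(move || {
        thread::sleep(Duration::from_millis(10));
        let _ = shutdown_tx.send(reason);
    });
    let steps = run_until_shutdown(shutdown_rx);
    source.join().expect("source thread panicked");
    steps
}

fn main() {
    for reason in [ShutdownReason::Sigterm, ShutdownReason::TeeImageHash] {
        println!("{:?}", simulate(reason));
    }
}
```

Because both sources feed the same channel, the teardown order is identical whichever one fires first, which is the property the first two criteria ask for.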

Resources & Additional Notes

  • The investigation that surfaced this issue and the test campaign data live in docs/investigation/2121-back-migration-e2e-flake.md. With a working handler, the e2e test passed 1 of 5 runs, versus 0 of 5 without it on the same commit. This suggests the handler is a real production improvement but does not fully close the upstream nearcore restart panic, which fires non-deterministically regardless of shutdown cleanliness.
  • The upstream nearcore bug is documented separately in docs/investigation/nearcore-indexer-sigkill-restart-panic.md. That issue needs to be fixed in nearcore; this issue is the orthogonal mpc-node-side fix that should land regardless.
  • We considered also calling near_store::db::RocksDB::block_until_all_instances_are_dropped() (which neard's standalone binary does after shutdown_all_actors), but it hangs indefinitely in our embedding because our indexer thread's block_on never returns: the spawned monitor tasks hold Arc<IndexerState> / Arc<RocksDB> references that nothing currently cancels on shutdown. A proper fix for that hang would wire a CancellationToken through the indexer thread; out of scope for this issue.
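The hang described in the last note can be reproduced in miniature. The sketch below is a std-only model under stated assumptions: `Db` stands in for the RocksDB handle, threads stand in for the spawned monitor tasks, an `AtomicBool` stands in for the proposed `CancellationToken`, and `wait_until_sole_owner` is a hypothetical, deadline-bounded analogue of waiting for all instances to be dropped. None of these names come from nearcore or mpc-node.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread;
use std::time::{Duration, Instant};

/// Hypothetical stand-in for the database handle.
struct Db;

/// Spawns `count` monitor threads. Each holds a clone of `db` until the
/// shared cancel flag is set, then drops it and exits.
fn spawn_monitors(
    db: &Arc<Db>,
    cancel: &Arc<AtomicBool>,
    count: usize,
) -> Vec<thread::JoinHandle<()>> {
    (0..count)
        .map(|_| {
            let db = Arc::clone(db);
            let cancel = Arc::clone(cancel);
            thread::spawn(move || {
                while !cancel.load(Ordering::Acquire) {
                    thread::sleep(Duration::from_millis(5));
                }
                // Releasing the reference is what unblocks the waiter.
                drop(db);
            })
        })
        .collect()
}

/// Polls until `db` is the only remaining reference. Returns false if
/// `deadline` elapses first; without a deadline this would wait forever.
fn wait_until_sole_owner(db: &Arc<Db>, deadline: Duration) -> bool {
    let start = Instant::now();
    while Arc::strong_count(db) > 1 {
        if start.elapsed() >= deadline {
            return false;
        }
        thread::sleep(Duration::from_millis(5));
    }
    true
}

fn main() {
    let db = Arc::new(Db);
    let cancel = Arc::new(AtomicBool::new(false));
    let monitors = spawn_monitors(&db, &cancel, 3);

    // No cancellation yet: the monitors still hold their clones.
    println!("before cancel: {}", wait_until_sole_owner(&db, Duration::from_millis(50)));

    cancel.store(true, Ordering::Release);
    for handle in monitors {
        handle.join().expect("monitor thread panicked");
    }
    println!("after cancel: {}", wait_until_sole_owner(&db, Duration::from_millis(50)));
}
```

The wait can only succeed once something tells the reference holders to stop, which is why the note treats cancellation wiring as the prerequisite for calling the blocking wait.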
