fix(node): handle SIGTERM for graceful shutdown on operator stop by barakeinav1 · Pull Request #3410 · near/mpc

barakeinav1 · 2026-05-31T14:19:08Z

Summary

Installs a SIGTERM handler in mpc-node that routes the signal into the existing internal shutdown channel — operators stopping the node via dstack CVM stop / docker stop / kubectl delete / systemctl stop now get graceful shutdown semantics, where before SIGTERM was effectively SIGKILL.
After the main select! exits, calls near_async::shutdown_all_actors() so nearcore's actor system can commit any in-flight RocksDB batches before the process exits.

Closes #3409. Does NOT close the upstream nearcore restart panic at streamer/mod.rs:207 — see docs/investigation/nearcore-indexer-sigkill-restart-panic.md. Test campaign data on the investigation branch showed this handler gets back-migration e2e from 0/5 pass to 1/5 — a real production improvement, but not a full fix for the upstream non-determinism.

Notable non-action: no `RocksDB::block_until_all_instances_are_dropped()`

neard's standalone binary follows shutdown_all_actors() with RocksDB::block_until_all_instances_are_dropped(). We deliberately don't, because in our embedding it hangs indefinitely: the indexer runs in a std::thread::spawn'd closure whose block_on never returns — the spawned monitor_*/indexer_logger tasks hold Arc<IndexerState> → Arc<RocksDB> references that nothing currently cancels on shutdown. Calling block_until_all_instances_are_dropped would block the SIGTERM path past any reasonable grace period and the orchestrator would SIGKILL us anyway. A proper fix would wire a CancellationToken through the indexer thread; out of scope for this PR. The inline comment captures the rationale.

Test plan

CI green (cargo check / clippy / fmt on the new code).
Manual: run mpc-node locally against localnet, send kill -TERM <pid>, observe SIGTERM received, initiating graceful shutdown log followed by Stopping nearcore actor system. and a clean exit (typically <1s).
No regression: existing e2e tests pass (the should_handle_back_migration_a_to_b_to_a flake is pre-existing and tracks the upstream bug, not this handler).

mpc-node had no SIGTERM handler installed, so SIGTERM from container/orchestrator stop (dstack CVM stop, docker stop, kubectl delete, systemctl stop) was effectively SIGKILL — the OS terminated the process immediately, leaving the embedded near-indexer thread to be killed mid-write. The next start could then trip the nearcore restart panic documented in docs/investigation/. This installs a SIGTERM handler that routes into the existing shutdown_signal channel, so the same select! arm that handles TEE image-hash-initiated shutdowns also handles SIGTERM. After the main loop exits we call near_async::shutdown_all_actors() so nearcore's actor system can commit any in-flight RocksDB batches before we return. We deliberately do NOT call RocksDB::block_until_all_instances_are_dropped() (which neard's standalone binary does next) — our embedded indexer runs in a std::thread::spawn'd closure whose block_on never returns because spawned monitor tasks hold Arc<IndexerState> → Arc<RocksDB> refs that nothing cancels. Calling block_until_all_instances_are_dropped would hang the SIGTERM path past any reasonable grace period and end in the SIGKILL we were trying to avoid. RocksDB's WAL still guarantees committed data survives kill; this just closes a smaller flush window. See the inline comment. Closes #3409. Does NOT close the upstream nearcore restart panic at streamer/mod.rs:207 which fires non-deterministically regardless of shutdown cleanliness — see docs/investigation/nearcore-indexer-sigkill-restart-panic.md.

Copilot

Pull request overview

Installs a Unix SIGTERM handler in mpc-node that forwards SIGTERM into the existing internal shutdown channel so operator-initiated stops (Docker/Kubernetes/systemd/dstack) get the same graceful path used by TEE image-hash-driven shutdowns, and calls near_async::shutdown_all_actors() after the main select! exits to give nearcore's actor system a chance to flush in-flight RocksDB batches.

Changes:

Spawn a task on root_runtime that installs a SIGTERM handler and sends () on shutdown_signal_sender when received.
After graceful shutdown, invoke near_async::shutdown_all_actors() (with an inline rationale for skipping RocksDB::block_until_all_instances_are_dropped()).

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

claude · 2026-05-31T14:21:01Z

Pull request overview

Adds a SIGTERM handler in mpc-node that routes the signal into the existing internal shutdown channel, so operator-initiated stops (dstack CVM stop, docker stop, kubectl delete, systemctl stop) get the same graceful path as TEE image-hash-initiated shutdowns. After the main select! exits, the function now also calls near_async::shutdown_all_actors() so nearcore's actor system can commit in-flight RocksDB batches. The deliberate omission of RocksDB::block_until_all_instances_are_dropped() is documented inline with the underlying reason (embedded indexer thread's block_on never returns, so it would hang past any reasonable grace period).

Changes:

Install a SIGTERM listener via tokio::signal::unix::signal(SignalKind::terminate()) and forward to shutdown_signal_sender (size-1 mpsc shared with the image-hash watcher).
Append near_async::shutdown_all_actors() after the image hash watcher join, before returning the exit reason.

Reviewed changes

Per-file summary

File	Description
crates/node/src/run.rs	Spawns a SIGTERM-listener task into `root_runtime` that sends `()` into the existing shutdown channel; calls `near_async::shutdown_all_actors()` after image-hash-watcher shutdown.

Findings

Blocking (must fix before merge):

None.

Non-blocking (nits, follow-ups, suggestions):

crates/node/src/run.rs:195 — The inline comment points readers to docs/investigation/nearcore-indexer-sigkill-restart-panic.md, and the PR description links to that same file on main. Neither this PR nor main actually contains it (find docs -name 'nearcore-indexer-sigkill*' is empty, no commit touches docs/investigation/). Either add the doc in this PR or drop the reference so future readers don't chase a broken link.
crates/node/src/run.rs:184-210 — No test exercises the new path; engineering-standards.md asks that every change have a test that would fail under revert. Signal delivery is awkward to unit-test, but the routing logic ("SIGTERM source -> shutdown_signal_sender -> select! exits with the watcher-shutdown error") could be covered by extracting the listener into a small helper that takes a Stream of signal events and a Sender, then driving it from a test. Worth at least an e2e smoke check that kill -TERM produces a clean exit with the new log lines (the PR's manual test plan is a fine substitute for one round, but won't catch future regressions).
crates/node/src/run.rs:199 — Only SIGTERM is wired up. SIGINT (Ctrl-C) and SIGHUP still default-terminate. Probably fine for the immediate motivation (container/operator stop), but worth a follow-up since the cost of adding them here is one extra signal() call and a tokio::select!.
crates/node/src/run.rs:203 — let _ = shutdown_for_sigterm.send(()).await; is the right call (after the main select! triggers, the receiver drops and any subsequent sender errors are expected), but a one-line note explaining that would save the next reader a moment.
crates/node/src/run.rs:198 — Spawning the handler on root_runtime after the indexer thread is already spawned introduces a small race window where SIGTERM arriving early in run_mpc_node still default-terminates the process. Moving the handler install to the top of the function (before spawn_real_indexer) closes that window. Minor — startup is fast — but trivially fixed.

✅ Approved

…tdown question The earlier docs had to hedge: SIGTERM experiments couldn't distinguish 'graceful shutdown doesn't help' from 'we never had a SIGTERM handler.' We've now closed that gap by landing a real SIGTERM handler (#3409 / #3410) and re-running the back-migration campaign. - 2121-back-migration-e2e-flake.md: new section 'Real SIGTERM handler in mpc-node — also does not fix it' covering iteration 1 (handler hangs on block_until_all_instances_are_dropped, 5/5 SIGKILL fallback) and iteration 2 (drop the block_until call, 1/5 pass with 100 ms graceful shutdowns in 5/5). Follow-up #5 updated — in flight via #3409 / #3410. - nearcore-indexer-sigkill-restart-panic.md: new 'Strongest evidence: graceful shutdown doesn't help' subsection inside 'What we ruled out'; table updated with the 5-run SIGTERM-handler row (4/5 fail at 100 ms); TL;DR + Workarounds + References updated; old SIGTERM-disclaimer removed since the question is now answered. Together the docs now say: same panic, same rate, regardless of whether shutdown was SIGKILL or a verified 100 ms graceful actor-system stop. The fix has to be in nearcore.

Copilot AI review requested due to automatic review settings May 31, 2026 14:19

Copilot started reviewing on behalf of barakeinav1 May 31, 2026 14:19 View session

Copilot AI reviewed May 31, 2026

View reviewed changes

barakeinav1 mentioned this pull request May 31, 2026

mpc-node: handle SIGTERM for graceful shutdown on operator stop #3409

Open

4 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(node): handle SIGTERM for graceful shutdown on operator stop#3410

fix(node): handle SIGTERM for graceful shutdown on operator stop#3410
barakeinav1 wants to merge 1 commit into
mainfrom
barak/3409-mpc-node-sigterm-handler

barakeinav1 commented May 31, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

claude Bot commented May 31, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

barakeinav1 commented May 31, 2026

Summary

Notable non-action: no RocksDB::block_until_all_instances_are_dropped()

Test plan

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

claude Bot commented May 31, 2026

Pull request overview

Reviewed changes

Findings

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Notable non-action: no `RocksDB::block_until_all_instances_are_dropped()`