Background
The migration web server's import_keyshares channel receiver is owned exclusively by the onboard() loop (crates/node/src/migration_service/onboarding.rs:36). When that loop exits with OnboardingJob::Done (line 55) — which happens immediately at startup for any node that is already an active participant — the receiver is dropped while the web server task keeps running with a dangling sender.
After that point, any PUT /set_keyshares to the node returns a 500 with the message "keyshares receiver channel is closed" (crates/node/src/migration_service/web/server.rs:185-191). The node must be restarted to re-enter the onboarding loop and recreate the channel.
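The failure mode can be reproduced in isolation. The following is a minimal sketch, not the node's code: it uses std::sync::mpsc as a stand-in because the issue does not say which channel type the node uses, and Keyshares, onboard and put_set_keyshares are hypothetical simplifications of the real types and handlers.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};

/// Hypothetical stand-in for the keyshare payload.
#[allow(dead_code)]
#[derive(Debug)]
struct Keyshares(Vec<u8>);

/// Stand-in for onboard(): it owns the receiver and returns immediately,
/// as it would for a node that is already an active participant.
fn onboard(receiver: Receiver<Keyshares>) {
    // Returning from this function drops `receiver`, closing the channel.
    drop(receiver);
}

/// Stand-in for the PUT /set_keyshares handler: a failed send becomes a 500.
fn put_set_keyshares(sender: &Sender<Keyshares>, body: Vec<u8>) -> (u16, &'static str) {
    match sender.send(Keyshares(body)) {
        Ok(()) => (200, "ok"),
        // send() only fails when the receiving half has been dropped.
        Err(_) => (500, "keyshares receiver channel is closed"),
    }
}

fn main() {
    let (sender, receiver) = channel();
    // The onboarding loop exits with Done and takes the receiver with it.
    onboard(receiver);
    // The web server still holds the sender, so every later PUT fails.
    let (status, message) = put_set_keyshares(&sender, vec![1, 2, 3]);
    println!("{status} {message}");
}
```

In this sketch the sender stays valid as a value, but every send fails once the receiver is gone, which matches the behaviour described above: only recreating the channel (here, a restart) makes the PUT succeed again.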
This surfaced when drafting a back-migration test that does not kill+restart A0 between forward and back directions (PR #3388): A0 cannot accept the back-migration PUT until its process restarts, because A0 was already an active participant at startup and onboard() returned before any migration was initiated.
User Story
As an operator running a back-migration (B → A) onto a node A that is currently running and has previously been an active participant, I want the migration to succeed without restarting A — or, if a restart is required by design, for that requirement to be documented and the error message to say so.
Acceptance Criteria
onboard()), or re-enter onboard() when the contract transitions back into a state where this node is an idle migration target.
Resources & Additional Notes
crates/node/src/migration_service.rs:42 - channel created
crates/node/src/migration_service/onboarding.rs:45-87 - loop that exits on Done
crates/node/src/migration_service/web/server.rs:185-191 - PUT handler that 500s when the receiver is gone
crates/node/src/migration_service/types.rs:78-101 - OnboardingJob::new (Active(Active) → Done)
crates/node/src/run.rs:335 - startup site that .awaits spawn_recovery_server_and_run_onboarding
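For the re-entry option in the acceptance criteria, one possible shape is for the caller to keep the receiver and lend it to each onboarding pass, so that a Done result no longer closes the channel. This is only a sketch under assumptions: it uses std::sync::mpsc in place of the node's real channel, onboard_once is a hypothetical helper, and the Onboard variant is invented for illustration (only Done appears in the issue).

```rust
use std::sync::mpsc::{channel, Receiver};

/// Simplified job type. `Done` comes from the issue; `Onboard` is hypothetical.
#[derive(Debug)]
enum OnboardingJob {
    Done,
    Onboard,
}

/// One onboarding pass. It borrows the receiver instead of owning it,
/// so returning on `Done` leaves the channel open for a later pass.
fn onboard_once(job: OnboardingJob, receiver: &Receiver<Vec<u8>>) -> Option<Vec<u8>> {
    match job {
        // Already an active participant: nothing to import on this pass.
        OnboardingJob::Done => None,
        // Idle migration target: take keyshares if a PUT has delivered them.
        OnboardingJob::Onboard => receiver.try_recv().ok(),
    }
}

fn main() {
    let (sender, receiver) = channel();

    // First pass at startup: the node is active, so the pass ends with Done.
    assert!(onboard_once(OnboardingJob::Done, &receiver).is_none());

    // The receiver is still alive, so a later PUT can hand over keyshares.
    assert!(sender.send(vec![7]).is_ok());

    // The contract transitions back and the pass is re-entered.
    let imported = onboard_once(OnboardingJob::Onboard, &receiver);
    println!("{imported:?}");
}
```

The design choice illustrated here is ownership: whichever component outlives a single onboarding pass holds the receiver, so the web server's sender never observes a closed channel while the process is running.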