Surface conclude_node_migration failure cause in node logs #3368

@barakeinav1

Background

While investigating #2121 (back-migration not completing), we found that when the contract rejects conclude_node_migration (e.g., with InvalidTeeRemoteAttestation), the node has no visibility into the rejection reason. The retry loop fires every ~60s indefinitely with no log line explaining why the conclude was rejected.

Three mechanics combine to produce this silent-failure mode:

  1. In crates/node/src/indexer/tx_sender.rs:294-302, observe_tx_result for ConcludeNodeMigration(_) returns TransactionStatus::Unknown with the comment "We don't care. The contract state change will handle this."
  2. crates/node/src/migration_service/onboarding.rs::send_conclude_onboarding uses bare tx_sender.send (not send_and_wait), which returns Ok the moment the TX is applied — even if the contract function rolled back.
  3. crates/node/src/indexer/handler.rs:313 matches ExecutionStatusView::SuccessReceiptId only; failed receipts are silently dropped.
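
To illustrate the third point, here is a minimal sketch of classifying a receipt outcome rather than matching only the success variant. ExecutionStatus and ObservedOutcome are hypothetical, simplified stand-ins defined for this example; they are not the node's real ExecutionStatusView or TransactionStatus types, and the actual handler code may need a different shape.

```rust
// Hypothetical, simplified stand-in for the receipt status the indexer sees.
// The real ExecutionStatusView carries richer payloads.
#[derive(Debug, Clone, PartialEq)]
enum ExecutionStatus {
    SuccessReceiptId(String),
    SuccessValue(Vec<u8>),
    // Contract error rendered as text, e.g. "InvalidTeeRemoteAttestation".
    Failure(String),
}

// Hypothetical result type for this sketch.
#[derive(Debug, PartialEq)]
enum ObservedOutcome {
    Executed,
    Rejected { reason: String },
}

/// Classify every status variant, so a failed receipt is reported
/// instead of falling through an unmatched arm.
fn classify(status: &ExecutionStatus) -> ObservedOutcome {
    match status {
        ExecutionStatus::SuccessReceiptId(_) | ExecutionStatus::SuccessValue(_) => {
            ObservedOutcome::Executed
        }
        ExecutionStatus::Failure(reason) => ObservedOutcome::Rejected {
            reason: reason.clone(),
        },
    }
}
```

Because the match is exhaustive, adding a new status variant later would produce a compile error here rather than a silently dropped receipt.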

The end-user-visible result during #2121 was "appears to succeed at the backup-cli and TLS layer, but the MPC contract state does not update," with manual intervention needed. The lack of a diagnostic log was a primary reason the bug was hard to triage — the operator and investigators had to read on-chain receipts directly to discover the rejection cause. PR #3362 fixes the most common root cause (stale on-chain attestation at conclude time), but other rejection paths from conclude_node_migration (e.g., KeysetMismatch, AccountPublicKeyMismatch, MigrationNotFound) would hit the same silent-failure mode.

User Story

As a node operator, when conclude_node_migration is rejected by the contract, I want to see the rejection reason in the node logs so that I can diagnose and respond without inspecting on-chain receipts directly.

Acceptance Criteria

  • When a ConcludeNodeMigration transaction is included in a block but the contract returns Err, the node emits a structured tracing::warn! (or higher) including the underlying error variant (e.g., InvalidTeeRemoteAttestation, KeysetMismatch).
  • The mechanism does not regress existing behavior for successful conclude transactions (no spurious warnings on success).
  • The pattern is generalizable, so other "fire-and-forget" transactions that currently share the Unknown branch in observe_tx_result (StartKeygen, StartReshare, VotePk, VoteReshared, VoteAbortKeyEventInstance, VerifyTee, RegisterForeignChainConfig) can be migrated to surface their rejection causes too, even if migrating them is out of scope for this issue.
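
One possible shape for these criteria, as a sketch only: a single helper that takes the transaction kind and its outcome and returns a warning only on rejection. TxKind, TxOutcome, and rejection_warning are hypothetical names introduced for illustration and do not exist in the codebase. The helper returns the message as a string so the example stays dependency-free; in the node, the same fields could presumably be passed to tracing::warn! as structured fields.

```rust
/// Hypothetical subset of the fire-and-forget transaction kinds.
#[derive(Debug, Clone, Copy, PartialEq)]
enum TxKind {
    ConcludeNodeMigration,
    VotePk,
    VerifyTee,
}

/// Hypothetical outcome of a transaction that was included in a block.
enum TxOutcome {
    Success,
    // Name of the contract error variant, e.g. "KeysetMismatch".
    ContractError(String),
}

/// Returns the warning text to log, or None when the transaction
/// succeeded, so successful concludes produce no spurious warnings.
fn rejection_warning(kind: TxKind, outcome: &TxOutcome) -> Option<String> {
    match outcome {
        TxOutcome::Success => None,
        TxOutcome::ContractError(variant) => Some(format!(
            "contract rejected transaction: tx_kind={:?} error={}",
            kind, variant
        )),
    }
}
```

Since the helper is keyed on the transaction kind rather than hard-coded to ConcludeNodeMigration, the other transactions sharing the Unknown branch could adopt it one at a time.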

Resources & Additional Notes
