
fix(activation): don't retry one-shot ACM activation calls on timeout #2743

Open
madhavilosetty-intel wants to merge 2 commits into main from fix/upgrade-to-admin-one-shot-timeout

@madhavilosetty-intel
Contributor

On a successful ACM activation (AdminSetup) or CCM->ACM upgrade (UpgradeClientToAdmin), AMT transitions to admin mode and drops the session without sending a WSMAN response, so a gateway timeout is the expected outcome. invokeWsmanCall was treating that timeout as retryable (wsman_max_attempts floor) and re-issuing the non-idempotent call against a device already in ACM, producing HTTP 401 / connection resets and eventually an uncaught exception.

Add an opt-in oneShot flag to invokeWsmanCall that caps the call at a single attempt, and use it for sendAdminSetup and sendUpgradeClientToAdmin so the timeout propagates to the state machine, which waits and re-queries device status via CHECK_ACTIVATION_ON_AMT.
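
The attempt cap described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `resolveMaxAttempts`, `invokeWithRetry`, `isTimeout`, and the `'gateway timeout'` message are hypothetical names, and the real `invokeWsmanCall` signature may differ.

```typescript
// Hypothetical sketch: names and signatures are illustrative only.
type WsmanSend<T> = () => Promise<T>

// Assumption: a timeout is recognisable by its message in this sketch.
function isTimeout (err: unknown): boolean {
  return err instanceof Error && err.message === 'gateway timeout'
}

// One-shot calls get exactly one attempt; other calls get at least one.
function resolveMaxAttempts (configuredAttempts: number, oneShot: boolean): number {
  return oneShot ? 1 : Math.max(1, configuredAttempts)
}

async function invokeWithRetry<T> (
  send: WsmanSend<T>,
  configuredAttempts: number,
  oneShot: boolean = false
): Promise<T> {
  const maxAttempts = resolveMaxAttempts(configuredAttempts, oneShot)
  let lastError: unknown
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await send()
    } catch (err) {
      // Only timeouts are retried here; anything else propagates immediately.
      if (!isTimeout(err)) throw err
      lastError = err
    }
  }
  // Attempts exhausted: surface the last timeout to the caller.
  throw lastError
}
```

With `oneShot` set, the first timeout reaches the caller unchanged, so a non-idempotent request is never sent a second time.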

PR Checklist

  • Unit Tests have been added for new changes
  • API tests have been updated if applicable
  • All commented code has been removed
  • If you've added a dependency, you've ensured license is compatible with Apache 2.0 and clearly outlined the added dependency.

What are you changing?

Anything the reviewer should know when reviewing this PR?

If there are associated PRs in other repositories, please link them here (e.g. device-management-toolkit/repo#365)

Copilot AI left a comment

Pull request overview

This PR prevents non-idempotent ACM activation operations from being retried when AMT drops the session on a successful activation/upgrade (resulting in an expected gateway timeout). It adds a oneShot option to the shared WSMAN invocation helper so that these specific calls are capped to a single attempt and the timeout can propagate to the activation state machine, which then re-checks device status.

Changes:

  • Add an opt-in oneShot flag to invokeWsmanCall to cap retry attempts to 1 for one-shot operations.
  • Use oneShot for AdminSetup (ACM activation) and UpgradeClientToAdmin (CCM→ACM upgrade) WSMAN calls.
  • Update activation unit tests to assert the new invokeWsmanCall(..., oneShot=true) invocation for these operations.
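
The caller-side flow in the overview (let the timeout propagate, wait, then re-check device status) might look like the sketch below. It is an assumption-laden illustration: `activateAndVerify`, `ControlMode`, `settleMs`, and the `'gateway timeout'` message are hypothetical, and the actual state machine performs the re-check through its own CHECK_ACTIVATION_ON_AMT step rather than a helper like this.

```typescript
// Hypothetical sketch: the real state machine uses its own states and events.
type ControlMode = 'pre-provisioning' | 'ccm' | 'acm'

async function activateAndVerify (
  sendOneShot: () => Promise<void>,
  queryControlMode: () => Promise<ControlMode>,
  settleMs: number
): Promise<boolean> {
  try {
    await sendOneShot()
  } catch (err) {
    // Assumption: a timeout is inconclusive rather than a failure, because a
    // successful activation drops the session without sending a response.
    if (!(err instanceof Error && err.message === 'gateway timeout')) throw err
    // Give the device time to settle before asking for its status.
    await new Promise<void>((resolve) => setTimeout(resolve, settleMs))
  }
  // Re-query the device instead of trusting the outcome of the call itself.
  return (await queryControlMode()) === 'acm'
}
```

The device's reported mode, not the WSMAN response, decides whether activation succeeded, which is why the timed-out call is not re-sent.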

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| src/stateMachines/common.ts | Adds oneShot flag to invokeWsmanCall to force a single attempt for non-idempotent activation/upgrade calls. |
| src/stateMachines/activation.ts | Marks AdminSetup and UpgradeClientToAdmin as one-shot WSMAN calls to avoid retries on expected timeouts. |
| src/stateMachines/activation.test.ts | Updates assertions to verify invokeWsmanCall is invoked with oneShot=true for AdminSetup/Upgrade. |

Comment thread src/stateMachines/common.ts
Copilot AI left a comment

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

Comment thread src/stateMachines/common.ts Outdated
@madhavilosetty-intel madhavilosetty-intel force-pushed the fix/upgrade-to-admin-one-shot-timeout branch from 43b2340 to 8d0c1d6 Compare June 2, 2026 22:38
