Ensure proper cancellation of in-flight workflow tasks by thepatrickchin · Pull Request #1819 · NVIDIA/NeMo-Agent-Toolkit

thepatrickchin · 2026-03-23T08:42:01Z

Description

Fixes a bug where an in-flight workflow task was not cancelled when the user clicked "Stop Generating" and then submitted a new prompt. This left the handler stuck, most visibly when an OAuth login window was blocked by a popup blocker or closed by the user.

Closes #1818

By Submitting this PR I confirm:

- I am familiar with the Contributing Guidelines.
- We require that all contributors "sign-off" on their commits. This certifies that the contribution is your original work, or you have rights to submit it under the same license, or a compatible license.
  - Any contribution which contains commits that are not Signed-Off will not be accepted.
- When the PR is ready for review, new or existing tests cover these changes.
- When the PR is ready for review, the documentation is up to date with these changes.

Summary by CodeRabbit

Bug Fixes
- Improved workflow cancellation: preempts and awaits prior runs, reliably clears running-task state, and suppresses completion and observability messages when a run is canceled.
- Refined async context manager behavior to avoid masking cancellations; it now raises only for unexpected unfinished runs when no external exception is being propagated.
Tests
- Added tests covering context-exit behavior, cancellation propagation, and workflow replacement handoff.
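
The "preempts and awaits prior runs" behavior summarized above can be illustrated with a minimal sketch. This is not the toolkit's code: `Handler`, `submit`, `_run`, and `events` are hypothetical names, and `asyncio.sleep` stands in for a long-running workflow such as one waiting on an OAuth redirect.

```python
import asyncio
from typing import Optional


class Handler:
    """Hypothetical stand-in for a message handler that owns one running task."""

    def __init__(self) -> None:
        self.running_task: Optional[asyncio.Task] = None
        self.events: list[str] = []

    async def _run(self, name: str, delay: float) -> None:
        try:
            await asyncio.sleep(delay)  # placeholder for the real workflow
            self.events.append(f"{name}: completed")
        except asyncio.CancelledError:
            self.events.append(f"{name}: cancelled")
            raise  # re-raise so the task ends up in the cancelled state

    async def submit(self, name: str, delay: float) -> None:
        previous = self.running_task
        if previous is not None and not previous.done():
            previous.cancel()
            try:
                # Wait until the old run has fully unwound before replacing it.
                await previous
            except asyncio.CancelledError:
                # Simplification: real code may need to tell apart a
                # cancellation of the caller from the one requested here.
                pass
        self.running_task = asyncio.create_task(self._run(name, delay))


async def demo() -> list[str]:
    handler = Handler()
    await handler.submit("first", delay=10.0)
    await asyncio.sleep(0)  # let the first task start and block
    await handler.submit("second", delay=0.0)
    await handler.running_task
    return handler.events
```

Running `asyncio.run(demo())` in this sketch records the first run as cancelled before the second run completes, which is the ordering the fix aims to guarantee.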

copy-pr-bot · 2026-03-23T08:42:06Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

coderabbitai · 2026-03-23T08:42:24Z

Walkthrough

Enforces workflow schema presence, preempts and awaits any in-flight workflow task before starting a new one, makes workflow run cancellation-aware (avoids sending completion/observability on cancel), tightens done-callback clearing, and prevents `Runner.__aexit__` from masking propagated exceptions; adds async tests for Runner exit/cancellation behavior.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Workflow Task Cancellation & Lifecycle `packages/nvidia_nat_core/src/nat/front_ends/fastapi/message_handler.py` | `process_workflow_request` now raises `RuntimeError` if `_workflow_schema_type` is `None`, cancels and awaits any existing `_running_workflow_task` before starting a new one, unifies task-creation path, adjusts done-callback to clear `_running_workflow_task` only when the finished task matches the stored task and only removes conversation handler when no replacement exists. `_run_workflow` now tracks cancellation (sets local `_cancelled`, re-raises `asyncio.CancelledError`) and suppresses sending `RESPONSE_MESSAGE` and the observability trace when cancelled while still clearing `_pending_observability_trace`. |
| Context Manager Exception Handling `packages/nvidia_nat_core/src/nat/runtime/runner.py` | `Runner.__aexit__` now raises `ValueError` only if the runner is in a non-terminal state AND no exception is being propagated (`exc_type is None`), so external cancellations/exceptions are not remapped to `ValueError`. |
| Context Manager Behavior Tests `packages/nvidia_nat_core/tests/nat/runtime/test_runner.py` | Added three async pytest cases: (1) verify exiting `async with Runner(...)` without calling `result()` raises `ValueError`, (2) verify `asyncio.CancelledError` raised inside the context propagates, (3) exercise workflow-replacement handoff by cancelling an in-flight Runner and starting a second Runner on the same shared state to confirm cancellation propagation and successful subsequent completion. |
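
As a rough illustration of the `__aexit__` rule described in the table, the following sketch uses a hypothetical `MiniRunner` class. It is a simplified stand-in, not the toolkit's `Runner`, and the `finished` flag is an assumed replacement for the real runner's state tracking.

```python
import asyncio


class MiniRunner:
    """Hypothetical, simplified runner; not the toolkit's actual Runner class."""

    def __init__(self) -> None:
        self.finished = False

    async def __aenter__(self) -> "MiniRunner":
        return self

    async def result(self) -> str:
        self.finished = True
        return "done"

    async def __aexit__(self, exc_type, exc_value, traceback) -> bool:
        # Complain about an unfinished run only on a clean exit. If an exception
        # (including CancelledError) is already propagating, let it through.
        if not self.finished and exc_type is None:
            raise ValueError("Runner exited without producing a result")
        return False  # do not suppress the in-flight exception


async def exit_without_result() -> str:
    try:
        async with MiniRunner():
            pass
    except ValueError:
        return "ValueError"
    return "no error"


async def exit_while_cancelled() -> str:
    try:
        async with MiniRunner():
            raise asyncio.CancelledError()
    except asyncio.CancelledError:
        return "CancelledError"
    except ValueError:
        return "ValueError"
    return "no error"
```

Without the `exc_type is None` guard, the second helper would see a `ValueError` raised during exit instead of the original `CancelledError`, which is the masking the table describes.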

Sequence Diagram(s)

mermaid
sequenceDiagram
rect rgba(135,206,250,0.5)
participant Client
end
rect rgba(144,238,144,0.5)
participant MessageHandler
end
rect rgba(255,222,173,0.5)
participant Worker
end
rect rgba(221,160,221,0.5)
participant WorkflowTask
end

Client->>MessageHandler: send workflow request
MessageHandler->>MessageHandler: validate `_workflow_schema_type`
alt existing `_running_workflow_task`
    MessageHandler->>WorkflowTask: cancel existing task
    WorkflowTask-->>MessageHandler: Cancelled/Error
    MessageHandler->>MessageHandler: await termination, clear `_running_workflow_task`
end
MessageHandler->>Worker: create/start new WorkflowTask
Worker->>WorkflowTask: run workflow (`_run_workflow`)
WorkflowTask-->>Worker: completes or raises `CancelledError`
Worker->>MessageHandler: done callback (clear if matches current task)
alt not cancelled
    MessageHandler->>Client: send RESPONSE_MESSAGE and observability trace
end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 55.56% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately and concisely describes the main change: ensuring proper cancellation of in-flight workflow tasks to fix the OAuth-related workflow stuck issue. |
| Linked Issues check | ✅ Passed | The PR comprehensively addresses issue `#1818` by implementing workflow task cancellation (message_handler.py), proper exception handling during exit (runner.py), and adding test coverage for cancellation scenarios. |
| Out of Scope Changes check | ✅ Passed | All changes are directly scoped to fix the workflow cancellation issue: message_handler updates for task preemption, runner exit condition refinement, and corresponding test cases. |


coderabbitai

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

packages/nvidia_nat_core/src/nat/front_ends/fastapi/message_handler.py (1)

449-500: ⚠️ Potential issue | 🟠 Major

Make the pending trace local to the workflow task.

If either awaited send in this finally raises, the reset at the end is skipped and the handler keeps the previous run's trace. The next run can then suppress its own trace or emit stale data.

💡 One way to contain the state to a single workflow run

         _cancelled = False
+        pending_observability_trace: ResponseObservabilityTrace | None = None
         try:
             auth_callback = self._flow_handler.authenticate if self._flow_handler else None
             async with self._session_manager.session(user_id=self._user_id,
                                                      user_message_id=user_message_id,
                                                      conversation_id=conversation_id,
                                                      http_connection=self._socket,
                                                      user_input_callback=self.human_interaction_callback,
                                                      user_authentication_callback=auth_callback) as session:
                 self._session_manager._context.metadata._request.payload = self._user_message_payload
                 async for value in generate_streaming_response(payload,
                                                                session=session,
                                                                streaming=True,
                                                                step_adaptor=self._step_adaptor,
                                                                result_type=result_type,
                                                                output_type=output_type):

                     # Store observability trace to send after completion message
                     if isinstance(value, ResponseObservabilityTrace):
-                        if self._pending_observability_trace is None:
-                            self._pending_observability_trace = value
+                        if pending_observability_trace is None:
+                            pending_observability_trace = value
                         continue
@@
         finally:
             if not _cancelled:
                 await self.create_websocket_message(data_model=SystemResponseContent(),
                                                     message_type=WebSocketMessageType.RESPONSE_MESSAGE,
                                                     status=WebSocketMessageStatus.COMPLETE)

                 # Send observability trace after completion message
-                if self._pending_observability_trace is not None:
-                    await self.create_websocket_message(data_model=self._pending_observability_trace,
+                if pending_observability_trace is not None:
+                    await self.create_websocket_message(data_model=pending_observability_trace,
                                                         message_type=WebSocketMessageType.OBSERVABILITY_TRACE_MESSAGE)
-
-            self._pending_observability_trace = None

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@packages/nvidia_nat_core/src/nat/front_ends/fastapi/message_handler.py`
around lines 449 - 500, The handler-level _pending_observability_trace must be
made local to the workflow run: at the start of the workflow (before
streaming/generating responses in generate_streaming_response) capture the
current self._pending_observability_trace into a local variable (e.g.
pending_observability_trace = self._pending_observability_trace) and immediately
clear the handler state (self._pending_observability_trace = None) so the
handler no longer holds stale state; then use that local
pending_observability_trace when sending the observability message in the
finally block (instead of referencing self._pending_observability_trace) so any
exceptions while awaiting create_websocket_message won't leave stale traces on
the handler object. Ensure references in this change target
_pending_observability_trace and the finally send that calls
create_websocket_message (and leave generate_streaming_response and
authenticate/session logic unchanged).

🧹 Nitpick comments (1)

packages/nvidia_nat_core/tests/nat/runtime/test_runner.py (1)
287-319: Please cover the actual workflow-replacement path too.

These cases only validate Runner.__aexit__. The regression in packages/nvidia_nat_core/src/nat/front_ends/fastapi/message_handler.py happens when one live workflow task is cancelled and another starts immediately, so that handoff still needs a regression test.

As per coding guidelines, "Maintain >= 80% test coverage; add or update tests when introducing changes."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/nvidia_nat_core/tests/nat/runtime/test_runner.py` around lines 287 -
319, Add a new async test that exercises the workflow-replacement path: use
WorkflowBuilder to create two distinct entry functions (or reuse entry_fn then
immediately start a second run), start a Runner for the first workflow, trigger
an asyncio.CancelledError inside that running workflow (so it is cancelled), and
immediately start a second Runner invocation to simulate handoff; assert that
the cancelled task’s CancelledError is not masked by a ValueError and that the
second workflow runs to completion (e.g., returns expected result or calls
result()), referencing Runner, WorkflowBuilder, entry_fn (or a second entry
function), ContextState, and ExporterManager to set up and verify the handoff
behavior tied to the message_handler regression.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@packages/nvidia_nat_core/src/nat/front_ends/fastapi/message_handler.py`:
- Around line 278-297: The current logic cancels the previous
self._running_workflow_task and immediately starts a new asyncio.create_task
pointing at self._run_workflow, but the cancelled task can still mutate
handler-scoped state on self and race with the replacement; to fix, either
(preferred) serialize run startup by awaiting the cancelled task to finish
before creating the new one (e.g., call self._running_workflow_task.cancel()
then await it/await with timeout or use asyncio.shield to observe completion) so
the old run cannot teardown shared state after the new one starts, or
(alternative) change _run_workflow to accept a per-run context object (pass a
new RunContext instance into _run_workflow) and ensure all per-run state is
stored on that object instead of self; update references to
_running_workflow_task, _run_workflow, _conversation_id, and the
_done_callback/_worker.remove_conversation_handler logic accordingly so they
operate on the serialized/isolated run rather than shared handler state.


ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 4cbaff47-8f18-4c01-99aa-218943804912

📥 Commits

Reviewing files that changed from the base of the PR and between 7629460 and b61f236.

📒 Files selected for processing (3)

packages/nvidia_nat_core/src/nat/front_ends/fastapi/message_handler.py
packages/nvidia_nat_core/src/nat/runtime/runner.py
packages/nvidia_nat_core/tests/nat/runtime/test_runner.py

packages/nvidia_nat_core/src/nat/front_ends/fastapi/message_handler.py

coderabbitai

🧹 Nitpick comments (1)

packages/nvidia_nat_core/src/nat/front_ends/fastapi/message_handler.py (1)

278-284: Consider adding debug logging for cancelled task cleanup.

The pattern of catching and suppressing exceptions during cleanup is intentional here. However, logging at debug level would aid troubleshooting and satisfy static analysis (S110/BLE001) without changing behavior.

♻️ Optional: Add debug logging for visibility

         if self._running_workflow_task is not None:
             self._running_workflow_task.cancel()
             try:
                 await self._running_workflow_task
-            except (asyncio.CancelledError, Exception):
-                pass
+            except asyncio.CancelledError:
+                logger.debug("Previous workflow task cancelled successfully")
+            except Exception:
+                logger.debug("Previous workflow task raised exception during cancellation", exc_info=True)
             self._running_workflow_task = None

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@packages/nvidia_nat_core/src/nat/front_ends/fastapi/message_handler.py`
around lines 278 - 284, Add debug-level logging around the cancellation/cleanup
of self._running_workflow_task: before calling
self._running_workflow_task.cancel() log that the running workflow task is being
cancelled (include task repr/id), and inside the except (asyncio.CancelledError,
Exception) block log the caught exception at debug level (including exception
info) so cleanup still suppresses errors but is visible in logs; use the
component's existing logger (e.g., self._logger or the module logger) and keep
behavior unchanged while emitting these debug messages around the cancel(),
await self._running_workflow_task, and the exception handler.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 0a00474d-41ca-447c-afbe-373bad03ec1f

📥 Commits

Reviewing files that changed from the base of the PR and between b61f236 and cd5a3b3.

📒 Files selected for processing (1)

packages/nvidia_nat_core/src/nat/front_ends/fastapi/message_handler.py

- Cancel any in-flight workflow task when a new user message arrives so that a workflow stuck waiting on an OAuth redirect (e.g. blocked popup) can be cleanly replaced.
- Guard Runner.__aexit__ from raising a secondary ValueError when an exception (e.g. CancelledError) is already propagating.
- Suppress the COMPLETE/observability messages in _run_workflow when the task was cancelled.

Signed-off-by: Patrick Chin <8509935+thepatrickchin@users.noreply.github.com>

- Only clear _running_workflow_task when the callback matches the current task
- Move conversation handler removal outside the task-identity check so a cancelled task's callback still cleans up the handler when no new task was created (e.g. RuntimeError before asyncio.create_task)

Signed-off-by: Patrick Chin <8509935+thepatrickchin@users.noreply.github.com>
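
The task-identity check described in the commit message above can be sketched as follows. The `Handler` class, `_on_done`, and `start` are hypothetical names chosen for illustration, and `asyncio.sleep` / `asyncio.Event` stand in for real workflow runs.

```python
import asyncio
from typing import Coroutine, Optional


class Handler:
    """Hypothetical handler that tracks at most one running task."""

    def __init__(self) -> None:
        self.running_task: Optional[asyncio.Task] = None

    def _on_done(self, finished: asyncio.Task) -> None:
        # Clear the slot only if it still refers to the task that just finished.
        if self.running_task is finished:
            self.running_task = None

    def start(self, coro: Coroutine) -> asyncio.Task:
        task = asyncio.create_task(coro)
        task.add_done_callback(self._on_done)
        self.running_task = task
        return task


async def demo() -> tuple:
    handler = Handler()
    release = asyncio.Event()

    old = handler.start(asyncio.sleep(10))
    await asyncio.sleep(0)  # let the old task start
    old.cancel()
    new = handler.start(release.wait())  # the replacement is now tracked

    try:
        await old  # the old task's done callback runs before this returns
    except asyncio.CancelledError:
        pass
    kept_replacement = handler.running_task is new

    release.set()
    await new
    cleared_after_finish = handler.running_task is None
    return kept_replacement, cleared_after_finish
```

In this sketch the late callback of the cancelled task leaves the replacement in place, while the replacement's own callback still clears the slot once it finishes.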

Reset _pending_observability_trace unconditionally in the finally block so a stale trace from a cancelled task cannot persist into the next workflow execution.

Signed-off-by: Patrick Chin <8509935+thepatrickchin@users.noreply.github.com>

Signed-off-by: Patrick Chin <8509935+thepatrickchin@users.noreply.github.com>

_done_callback closed over self._conversation_id directly, which can change if a new message arrives before the task completes. The callback could then remove the wrong conversation's handler registration. Capture the value into a local variable so each callback always cleans up the conversation it was created for.

Signed-off-by: Patrick Chin <8509935+thepatrickchin@users.noreply.github.com>
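
The closure issue described in this commit message is a general Python late-binding pitfall. The sketch below contrasts the two bindings; `Handler`, `removed_late`, and `removed_captured` are hypothetical names, and appending to a list stands in for removing a conversation handler.

```python
import asyncio


class Handler:
    """Hypothetical handler whose conversation ID changes between requests."""

    def __init__(self) -> None:
        self.conversation_id = "conv-1"
        self.removed_late: list[str] = []
        self.removed_captured: list[str] = []

    def start(self) -> asyncio.Task:
        task = asyncio.create_task(asyncio.sleep(0))

        # Late binding: reads self.conversation_id when the callback runs.
        task.add_done_callback(
            lambda _t: self.removed_late.append(self.conversation_id))

        # Early binding: snapshot the ID that was current at task creation.
        conversation_id = self.conversation_id
        task.add_done_callback(
            lambda _t: self.removed_captured.append(conversation_id))
        return task


async def demo() -> tuple:
    handler = Handler()
    task = handler.start()
    handler.conversation_id = "conv-2"  # a new message arrives mid-flight
    await task
    await asyncio.sleep(0)  # extra yield so both callbacks have run
    return handler.removed_late, handler.removed_captured
```

The late-bound callback acts on the newer conversation, while the callback that captured a local copy acts on the conversation it was created for.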

Prevents a race where the old task's finally block mutates shared handler state (e.g. pending_observability_trace) concurrently with a new workflow run. Also suppress CancelledError in Runner.__aexit__ and add tests to cover both clean-exit and cancellation paths.

Signed-off-by: Patrick Chin <8509935+thepatrickchin@users.noreply.github.com>

coderabbitai

Actionable comments posted: 1

♻️ Duplicate comments (1)

packages/nvidia_nat_core/src/nat/front_ends/fastapi/message_handler.py (1)

272-293: ⚠️ Potential issue | 🟠 Major

Stage the replacement request locally until the old task is gone.

Line 272 commits the new request onto handler-scoped fields before the previous task has exited. The old run still reads self._message_parent_id/self._conversation_id in create_websocket_message() and self._user_message_payload on Line 462, and its done-callback on Lines 288-293 can also remove the newly registered conversation handler if it completes in that window. Parse from local values first, then copy them onto self only after the cancel/await block.

💡 Suggested shape

         try:
-            self._initialize_workflow_request(user_message_as_validated_type)
-            message_content: typing.Any = await self._process_websocket_user_message(user_message_as_validated_type)
-
-            if self._workflow_schema_type is None:
+            workflow_schema_type = user_message_as_validated_type.schema_type
+            if workflow_schema_type is None:
                 raise RuntimeError("Workflow schema type is not initialized")
+            message_content: typing.Any = await self._process_websocket_user_message(
+                user_message_as_validated_type,
+                workflow_schema_type,
+            )

             if self._running_workflow_task is not None:
                 self._running_workflow_task.cancel()
                 try:
                     await self._running_workflow_task
                 except (asyncio.CancelledError, Exception):
                     pass
                 self._running_workflow_task = None
+
+            self._initialize_workflow_request(user_message_as_validated_type)

_process_websocket_user_message() would need to take workflow_schema_type as an argument so this parsing stays self-free.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@packages/nvidia_nat_core/src/nat/front_ends/fastapi/message_handler.py`
around lines 272 - 293, The handler is assigning new request state onto self
(e.g., self._message_parent_id, self._conversation_id,
self._user_message_payload via _process_websocket_user_message) before the
previous _running_workflow_task has fully exited, which can cause races in
create_websocket_message and the _done_callback; fix by parsing/staging all
request-specific values into local variables first (call
_process_websocket_user_message with needed context like workflow_schema_type
instead of relying on self), then perform the cancel/await of
self._running_workflow_task, and only after the old task is cleared copy those
local variables onto self (and then register _done_callback and set
self._running_workflow_task). Ensure references to _message_parent_id,
_conversation_id, _user_message_payload, _process_websocket_user_message,
_running_workflow_task, _done_callback, _worker.get_conversation_handler, and
_worker.remove_conversation_handler are updated accordingly.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@packages/nvidia_nat_core/src/nat/front_ends/fastapi/message_handler.py`:
- Around line 493-504: The finally block may be cancelled while awaiting
create_websocket_message, leaving self._pending_observability_trace set across
requests; fix by wrapping the sends that reference
self._pending_observability_trace in an inner try/finally so that regardless of
cancellation you always clear self._pending_observability_trace in the inner
finally. Specifically, around the await self.create_websocket_message(...) calls
(the ones using SystemResponseContent() and self._pending_observability_trace
and WebSocketMessageType.*), add an inner try that performs the sends and an
inner finally that sets self._pending_observability_trace = None to guarantee
cleanup even if awaits are cancelled.
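
The cancellation-safe cleanup suggested in the inline comment above can be sketched as below. This is an illustration under assumptions, not the handler's real code: `Handler`, `send`, `finish_run`, and `pending_trace` are hypothetical, and `asyncio.sleep(10)` stands in for a slow websocket send.

```python
import asyncio
from typing import Optional


class Handler:
    """Hypothetical handler holding one piece of per-run state."""

    def __init__(self) -> None:
        self.pending_trace: Optional[str] = "trace-from-this-run"
        self.sent: list[str] = []

    async def send(self, message: str) -> None:
        await asyncio.sleep(10)  # placeholder for a slow websocket send
        self.sent.append(message)

    async def finish_run(self) -> None:
        try:
            await self.send("complete")
            if self.pending_trace is not None:
                await self.send(self.pending_trace)
        finally:
            # Runs even if one of the awaits above is cancelled or raises.
            self.pending_trace = None


async def demo() -> tuple:
    handler = Handler()
    task = asyncio.create_task(handler.finish_run())
    await asyncio.sleep(0)  # let the task reach its first await
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass
    return handler.pending_trace, handler.sent, task.cancelled()
```

Even though the send is interrupted before anything is delivered, the `finally` clause resets the per-run state, so no stale trace is left for the next request.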

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: b2bae0b0-1da3-4ece-be8d-ff4d624563dc

📥 Commits

Reviewing files that changed from the base of the PR and between cd5a3b3 and dc91e06.

📒 Files selected for processing (3)

packages/nvidia_nat_core/src/nat/front_ends/fastapi/message_handler.py
packages/nvidia_nat_core/src/nat/runtime/runner.py
packages/nvidia_nat_core/tests/nat/runtime/test_runner.py

✅ Files skipped from review due to trivial changes (1)

packages/nvidia_nat_core/tests/nat/runtime/test_runner.py

🚧 Files skipped from review as they are similar to previous changes (1)

packages/nvidia_nat_core/src/nat/runtime/runner.py

packages/nvidia_nat_core/src/nat/front_ends/fastapi/message_handler.py

Signed-off-by: Patrick Chin <8509935+thepatrickchin@users.noreply.github.com>

coderabbitai

🧹 Nitpick comments (1)

packages/nvidia_nat_core/src/nat/front_ends/fastapi/message_handler.py (1)

278-284: Consider debug-level logging for suppressed exceptions.

The silent pass is intentional for graceful cancellation cleanup, but logging at DEBUG level would aid troubleshooting without adding noise. Static analysis also flags this pattern (S110, BLE001).

♻️ Optional: Add debug logging

             if self._running_workflow_task is not None:
                 self._running_workflow_task.cancel()
                 try:
                     await self._running_workflow_task
-                except (asyncio.CancelledError, Exception):
-                    pass
+                except asyncio.CancelledError:
+                    logger.debug("Previous workflow task cancelled successfully")
+                except Exception:
+                    logger.debug("Previous workflow task raised exception during cancellation", exc_info=True)
                 self._running_workflow_task = None

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@packages/nvidia_nat_core/src/nat/front_ends/fastapi/message_handler.py`
around lines 278 - 284, The except block that currently swallows CancelledError
and other Exception for self._running_workflow_task should log the caught
exception at debug level instead of silently passing; update the handler around
self._running_workflow_task.cancel() / await self._running_workflow_task to call
the module logger (or self.logger) with a debug message including the exception
info (e.g., logger.debug("workflow task cancelled/failed", exc_info=exc) or
similar) when catching asyncio.CancelledError and Exception so suppressed errors
are recorded for troubleshooting while keeping behavior unchanged.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: baa9419b-ef7f-4c5c-8234-504d966e24ad

📥 Commits

Reviewing files that changed from the base of the PR and between dc91e06 and f24dc5d.

📒 Files selected for processing (1)

packages/nvidia_nat_core/src/nat/front_ends/fastapi/message_handler.py

thepatrickchin requested a review from a team as a code owner March 23, 2026 08:42

coderabbitai bot reviewed Mar 23, 2026

coderabbitai bot reviewed Mar 23, 2026

thepatrickchin added 6 commits March 23, 2026 17:30

feat: add runner tests

60c4abd

Signed-off-by: Patrick Chin <8509935+thepatrickchin@users.noreply.github.com>

thepatrickchin force-pushed the fix/oauth-retry branch from cd5a3b3 to dc91e06 Compare March 23, 2026 09:41

coderabbitai bot reviewed Mar 23, 2026

fix: make pending observability trace cleanup cancellation-safe.

f24dc5d

Signed-off-by: Patrick Chin <8509935+thepatrickchin@users.noreply.github.com>

coderabbitai bot reviewed Mar 23, 2026

thepatrickchin wants to merge 7 commits into NVIDIA:develop from thepatrickchin:fix/oauth-retry
