Skip to content

Improve handling of failed scan tasks and allow reconnect#332

Merged
GDYendell merged 2 commits intomainfrom
reconnect-handling
Mar 9, 2026
Merged

Improve handling of failed scan tasks and allow reconnect#332
GDYendell merged 2 commits intomainfrom
reconnect-handling

Conversation

@GDYendell
Copy link
Contributor

@GDYendell GDYendell commented Mar 4, 2026

Summary by CodeRabbit

  • Refactoring

    • Introduced a data-driven ControllerAPI and simplified how controller APIs and scan tasks are constructed and wired.
  • Bug Fixes

    • Connection close is idempotent and tolerates connection reset errors.
    • Failing scan tasks mark a controller as disconnected and pause periodic scanning until recovery.
  • New Features

    • Added reconnect operation to restore connectivity without restart.
    • Added command and scan decorators for controller methods.
  • Tests

    • Expanded tests for connection closing and scan/reconnect behavior.

@GDYendell GDYendell requested a review from shihab-dls March 4, 2026 17:37
@coderabbitai
Copy link

coderabbitai bot commented Mar 4, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Controllers now construct and expose a ControllerAPI tree and provide create_api_and_tasks() returning (ControllerAPI, periodic_coros, initial_coros). Transport-layer ControllerAPI was removed; modules now import ControllerAPI from fastcs.controllers. Added _connected-driven scan pause/resume, attribute callback Coroutine typing, and resilient IPConnection.close().

Changes

Cohort / File(s) Summary
Attribute types & callbacks
src/fastcs/attributes/attr_r.py
Changed update-callback type aliases to Coroutine[None, None, None] and removed the try/except around awaiting update callbacks.
Connection resilience & tests
src/fastcs/connections/ip_connection.py, tests/test_ip_connection.py
IPConnection.close() now returns early if no connection, swallows ConnectionResetError, always clears the connection; added tests covering close behaviors.
Controller API datamodel
src/fastcs/controllers/controller_api.py, src/fastcs/controllers/__init__.py
Added ControllerAPI dataclass (path, attributes, command_methods, scan_methods, sub_apis, description) with walk_api() and __repr__; exported from controllers.
Controller core & scanning
src/fastcs/controllers/base_controller.py, src/fastcs/controllers/controller.py
Added BaseController._build_api(); Controller gains _connected, create_api_and_tasks(), and _create_periodic_scan_coro() to separate initial vs periodic coroutines and handle disconnect/reconnect logic.
Transport-layer removal & ripples
src/fastcs/transports/controller_api.py, src/fastcs/transports/__init__.py
Removed transport-layer ControllerAPI implementation and its orchestration helpers; removed its export from transports.
Import updates across transports
src/fastcs/transports/..., src/fastcs/transports/epics/..., src/fastcs/transports/graphql/..., src/fastcs/transports/rest/..., src/fastcs/transports/tango/...
Updated many modules to import ControllerAPI from fastcs.controllers instead of fastcs.transports.controller_api.
ControlSystem & API construction
src/fastcs/control_system.py, tests/conftest.py, tests/assertable_controller.py
Removed build_controller_api helpers; serve() now calls controller.create_api_and_tasks(); tests updated to use controller._build_api([]) or create_api_and_tasks() and reference _connected.
Demo & test adjustments
src/fastcs/demo/controllers.py, tests/...
Added reconnect() to demo controller; tests modified for new API construction, _connected flag, and scan/reconnect behavior; added/updated controller-related tests and decorators in tests.

Sequence Diagram

sequenceDiagram
    participant CS as ControlSystem
    participant C as Controller
    participant API as ControllerAPI
    participant Scheduler as Scheduler
    participant Attr as AttrR

    CS->>C: create_api_and_tasks()
    C->>C: _build_api(path)
    C->>API: construct ControllerAPI tree
    C-->>CS: (ControllerAPI, periodic_coros, initial_coros)

    CS->>CS: await initial_coros
    CS->>Scheduler: start periodic_coros

    loop periodic
        Scheduler->>C: check _connected
        alt connected
            Scheduler->>C: run scan methods
            C->>Attr: trigger attribute update callbacks (await coroutine)
        else disconnected
            Scheduler-->>Scheduler: sleep/backoff until reconnect
        end
    end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

🐰 I hopped the API from transport to core,
Paths and scans now grow and softly soar.
Coroutines sing where Awaitables stood,
Sockets close gently, reconnects work good.
A rabbit cheers as paused scans wake once more.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 15.22% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'Improve handling of failed scan tasks and allow reconnect' accurately reflects the main changes: enhanced error handling for scan task failures and new reconnect functionality for controllers.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch reconnect-handling

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (4)
tests/conftest.py (1)

50-52: Consider using the public API pathway if available.

The fixture now calls the private _build_api([]) method directly. While acceptable for test infrastructure, relying on private methods can make tests fragile. If a public create_api_and_tasks() or similar method is intended for external use, consider whether the fixture should use that instead for better alignment with the public contract.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/conftest.py` around lines 50 - 52, The fixture controller_api currently
calls the private helper controller._build_api([]); change it to use the
controller's public factory method (e.g. create_api_and_tasks or another
documented public API-building method) if one exists so tests exercise the
public contract instead of private internals; locate the controller_api fixture
and replace controller._build_api([]) with the appropriate public call (passing
equivalent arguments) to construct the API and tasks.
tests/test_controllers.py (1)

260-283: Consider using connect() instead of directly setting _connected, and improve task cleanup.

  1. Line 271: Prefer calling await controller.connect() instead of manually setting controller._connected = True. This better reflects the intended usage pattern and tests the public API.

  2. Lines 282-283: The task cancellation should handle CancelledError to avoid potential issues:

♻️ Proposed improvements
 `@pytest.mark.asyncio`
 async def test_scan_exception_sets_disconnected_and_reconnect_resumes():
     class MyTestController(Controller):
         `@scan`(0.01)
         async def failing_scan(self):
             raise RuntimeError("scan error")

     controller = MyTestController()
     controller.post_initialise()
     _, scan_coros, _ = controller.create_api_and_tasks()

-    controller._connected = True
+    await controller.connect()
     task = asyncio.create_task(scan_coros[0]())

     # Wait long enough for the scan to run and raise, setting _connected = False
     await asyncio.sleep(0.1)
     assert not controller._connected

     # Trigger reconnect - _connected resumes scan tasks
     await controller.reconnect()
     assert controller._connected

     task.cancel()
-    await asyncio.sleep(0.01)
+    try:
+        await task
+    except asyncio.CancelledError:
+        pass
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_controllers.py` around lines 260 - 283, The test sets controller
state directly and doesn't properly await task cancellation; change the setup to
call and await controller.connect() (instead of setting controller._connected =
True) so the public API is exercised (see MyTestController, connect, reconnect,
scan decorator and scan_coros usage), and when cancelling the background task
created from scan_coros[0]() ensure you await it and handle
asyncio.CancelledError to cleanly swallow cancellation (i.e., cancel the task,
await it inside a try/except catching CancelledError) so the test teardown is
robust.
src/fastcs/controllers/controller.py (1)

55-57: Make base disconnect() clear connection state.

Base disconnect() currently leaves _connected unchanged. Resetting it to False makes lifecycle semantics safer for subclasses that don’t override disconnect().

🔧 Suggested change
     async def disconnect(self) -> None:
         """Hook to tidy up resources before stopping the application"""
-        pass
+        self._connected = False
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/fastcs/controllers/controller.py` around lines 55 - 57, The base async
disconnect() leaves the controller's connection flag unchanged; update
Controller.disconnect() to set self._connected = False to reliably clear
connection state for subclasses that do not override disconnect(), ensuring
lifecycle semantics are correct (modify the async def disconnect(self) method in
the Controller class to assign self._connected = False before returning).
tests/assertable_controller.py (1)

79-79: Prefer public API construction in this test helper.

Line 79 uses controller._build_api([]), which binds tests to a private method. Using create_api_and_tasks() keeps this helper aligned with supported API boundaries.

♻️ Suggested change
-        controller_api = controller._build_api([])
+        controller_api, _, _ = controller.create_api_and_tasks()
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/assertable_controller.py` at line 79, The test helper currently calls
the private method controller._build_api([]); replace that with the public
factory create_api_and_tasks() to avoid depending on a private symbol. Call
create_api_and_tasks() (or controller.create_api_and_tasks() if it is a method)
and unpack its return (e.g., api, tasks = create_api_and_tasks(...)) or assign
the returned API to controller_api so the helper uses the supported public API
surface instead of _build_api.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/fastcs/connections/ip_connection.py`:
- Around line 86-92: The current close sequence only clears self.__connection
when connection.close() succeeds or raises ConnectionResetError; move the
clearing into a finally block so self.__connection is set to None regardless of
what exception close() raises. Concretely, wrap the await connection.close()
call in try/except/finally (or use try/finally) inside the async with
self._connection context, catch/handle specific ConnectionResetError if desired,
and always assign self.__connection = None in the finally so the stale handle is
never retained; reference the symbols self._connection, connection.close(), and
self.__connection to locate and update the code.

In `@src/fastcs/control_system.py`:
- Around line 118-121: The startup currently calls self._run_initial_coros()
unconditionally after await self._controller.reconnect(), which means if
reconnect() returns without raising but leaves the connection flag
(self._connected or self._controller.connected) False those ONCE coroutines will
run while disconnected and never be retried; change the logic so
_run_initial_coros() only runs when a successful connection exists (check
self._connected or self._controller.connected) and otherwise mark the initial
coroutines as pending (e.g., set an attribute like self._initial_coros_pending =
True) and invoke _run_initial_coros() from the reconnect-success path (or the
reconnect callback/handler) when connection becomes True, leaving
_start_scan_tasks() behavior unchanged or similarly gated if needed.

In `@src/fastcs/demo/controllers.py`:
- Around line 98-102: reconnect currently calls self.connection.close() and
self.connection.connect(...) without handling exceptions, violating the
Controller.reconnect() contract; wrap the close/connect calls in a try/except
that catches Exception, log the failure with a descriptive message and exception
info (use a module logger via logging.getLogger(__name__)), and ensure
self._connected is only set True on successful connect and left False on failure
(do not re-raise).

---

Nitpick comments:
In `@src/fastcs/controllers/controller.py`:
- Around line 55-57: The base async disconnect() leaves the controller's
connection flag unchanged; update Controller.disconnect() to set self._connected
= False to reliably clear connection state for subclasses that do not override
disconnect(), ensuring lifecycle semantics are correct (modify the async def
disconnect(self) method in the Controller class to assign self._connected =
False before returning).

In `@tests/assertable_controller.py`:
- Line 79: The test helper currently calls the private method
controller._build_api([]); replace that with the public factory
create_api_and_tasks() to avoid depending on a private symbol. Call
create_api_and_tasks() (or controller.create_api_and_tasks() if it is a method)
and unpack its return (e.g., api, tasks = create_api_and_tasks(...)) or assign
the returned API to controller_api so the helper uses the supported public API
surface instead of _build_api.

In `@tests/conftest.py`:
- Around line 50-52: The fixture controller_api currently calls the private
helper controller._build_api([]); change it to use the controller's public
factory method (e.g. create_api_and_tasks or another documented public
API-building method) if one exists so tests exercise the public contract instead
of private internals; locate the controller_api fixture and replace
controller._build_api([]) with the appropriate public call (passing equivalent
arguments) to construct the API and tasks.

In `@tests/test_controllers.py`:
- Around line 260-283: The test sets controller state directly and doesn't
properly await task cancellation; change the setup to call and await
controller.connect() (instead of setting controller._connected = True) so the
public API is exercised (see MyTestController, connect, reconnect, scan
decorator and scan_coros usage), and when cancelling the background task created
from scan_coros[0]() ensure you await it and handle asyncio.CancelledError to
cleanly swallow cancellation (i.e., cancel the task, await it inside a
try/except catching CancelledError) so the test teardown is robust.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 942faf23-2977-4afa-afe4-a32b3fe23c64

📥 Commits

Reviewing files that changed from the base of the PR and between 5fc0836 and 62d40ea.

📒 Files selected for processing (33)
  • src/fastcs/attributes/attr_r.py
  • src/fastcs/connections/ip_connection.py
  • src/fastcs/control_system.py
  • src/fastcs/controllers/__init__.py
  • src/fastcs/controllers/base_controller.py
  • src/fastcs/controllers/controller.py
  • src/fastcs/controllers/controller_api.py
  • src/fastcs/demo/controllers.py
  • src/fastcs/transports/__init__.py
  • src/fastcs/transports/controller_api.py
  • src/fastcs/transports/epics/ca/ioc.py
  • src/fastcs/transports/epics/ca/transport.py
  • src/fastcs/transports/epics/docs.py
  • src/fastcs/transports/epics/gui.py
  • src/fastcs/transports/epics/pva/ioc.py
  • src/fastcs/transports/epics/pva/pvi.py
  • src/fastcs/transports/epics/pva/transport.py
  • src/fastcs/transports/epics/util.py
  • src/fastcs/transports/graphql/graphql.py
  • src/fastcs/transports/graphql/transport.py
  • src/fastcs/transports/rest/rest.py
  • src/fastcs/transports/rest/transport.py
  • src/fastcs/transports/tango/dsr.py
  • src/fastcs/transports/tango/transport.py
  • src/fastcs/transports/transport.py
  • tests/assertable_controller.py
  • tests/conftest.py
  • tests/test_control_system.py
  • tests/test_controllers.py
  • tests/transports/epics/ca/test_gui.py
  • tests/transports/epics/ca/test_softioc.py
  • tests/transports/epics/pva/test_pva_gui.py
  • tests/transports/rest/test_rest.py
💤 Files with no reviewable changes (2)
  • src/fastcs/transports/init.py
  • src/fastcs/transports/controller_api.py

@codecov
Copy link

codecov bot commented Mar 4, 2026

Codecov Report

❌ Patch coverage is 95.09804% with 5 lines in your changes missing coverage. Please review.
✅ Project coverage is 90.90%. Comparing base (5fc0836) to head (be338db).
⚠️ Report is 1 commits behind head on main.

Files with missing lines Patch % Lines
src/fastcs/controllers/controller.py 91.30% 4 Missing ⚠️
src/fastcs/controllers/controller_api.py 95.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #332      +/-   ##
==========================================
+ Coverage   90.83%   90.90%   +0.07%     
==========================================
  Files          70       70              
  Lines        2574     2574              
==========================================
+ Hits         2338     2340       +2     
+ Misses        236      234       -2     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Pause all scan tasks when an exception occurs in one
Add `Controller.reconnect` method to reocnnect and unpause scan tasks
Move creation of initial tasks and scan tasks into Controller
@GDYendell GDYendell force-pushed the reconnect-handling branch from 62d40ea to 07bcbc4 Compare March 5, 2026 08:36
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (2)
src/fastcs/controllers/base_controller.py (1)

393-404: Clean API tree construction with recursive sub-controller handling.

The _build_api method correctly builds the ControllerAPI hierarchy, including proper path propagation to nested sub-controllers.

Consider using unpacking syntax for the path construction (per Ruff RUF005):

🔧 Minor style improvement
             sub_apis={
-                name: sub_controller._build_api(path + [name])  # noqa: SLF001
+                name: sub_controller._build_api([*path, name])  # noqa: SLF001
                 for name, sub_controller in self.sub_controllers.items()
             },
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/fastcs/controllers/base_controller.py` around lines 393 - 404, In
_build_api, replace the list concatenation used to build child paths (currently
using path + [name] inside the sub_apis comprehension) with list unpacking so
the child path is constructed as [*path, name]; update the comprehension where
sub_controller._build_api is called (referencing sub_controllers,
sub_controller, sub_apis) to use the unpacked path form to satisfy Ruff RUF005
and keep semantics identical.
src/fastcs/controllers/controller.py (1)

85-93: Refactor to use the public io_ref property instead of directly accessing the private _io_ref attribute.

The Attribute class (inherited by AttrR) provides a public io_ref property and has_io_ref() method. Replace the pattern match on the private _io_ref with logic that uses these public accessors:

Example refactor
for attribute in api.attributes.values():
    if isinstance(attribute, AttrR) and attribute.has_io_ref():
        io_ref = attribute.io_ref
        if isinstance(io_ref, AttributeIORef):
            update_period = io_ref.update_period
            if update_period is ONCE:
                initial_coros.append(attribute.bind_update_callback())
            elif update_period is not None:
                scan_dict[update_period].append(
                    attribute.bind_update_callback()
                )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/fastcs/controllers/controller.py` around lines 85 - 93, The pattern-match
is directly accessing the private _io_ref; update the loop to use the public
API: check isinstance(attribute, AttrR) and attribute.has_io_ref(), get io_ref =
attribute.io_ref, ensure isinstance(io_ref, AttributeIORef), then read
update_period = io_ref.update_period and branch as before (if update_period is
ONCE -> initial_coros.append(attribute.bind_update_callback()); elif
update_period is not None ->
scan_dict[update_period].append(attribute.bind_update_callback())). Keep
references to AttrR, has_io_ref(), io_ref, AttributeIORef, update_period, ONCE,
bind_update_callback(), initial_coros and scan_dict to locate and update the
code.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/test_controllers.py`:
- Around line 282-283: After calling task.cancel(), replace the sleep-based
cleanup with explicitly awaiting the cancelled task to ensure proper teardown;
await the Task object named task and suppress/handle asyncio.CancelledError
(e.g., with a try/except around await task) so the cancelled coroutine is fully
awaited and avoids pending-task warnings in tests that use the task variable in
tests/test_controllers.py.

---

Nitpick comments:
In `@src/fastcs/controllers/base_controller.py`:
- Around line 393-404: In _build_api, replace the list concatenation used to
build child paths (currently using path + [name] inside the sub_apis
comprehension) with list unpacking so the child path is constructed as [*path,
name]; update the comprehension where sub_controller._build_api is called
(referencing sub_controllers, sub_controller, sub_apis) to use the unpacked path
form to satisfy Ruff RUF005 and keep semantics identical.

In `@src/fastcs/controllers/controller.py`:
- Around line 85-93: The pattern-match is directly accessing the private
_io_ref; update the loop to use the public API: check isinstance(attribute,
AttrR) and attribute.has_io_ref(), get io_ref = attribute.io_ref, ensure
isinstance(io_ref, AttributeIORef), then read update_period =
io_ref.update_period and branch as before (if update_period is ONCE ->
initial_coros.append(attribute.bind_update_callback()); elif update_period is
not None -> scan_dict[update_period].append(attribute.bind_update_callback())).
Keep references to AttrR, has_io_ref(), io_ref, AttributeIORef, update_period,
ONCE, bind_update_callback(), initial_coros and scan_dict to locate and update
the code.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 4018d294-4bb1-4a14-a4fb-b0d7bc0e8d1b

📥 Commits

Reviewing files that changed from the base of the PR and between 62d40ea and 07bcbc4.

📒 Files selected for processing (33)
  • src/fastcs/attributes/attr_r.py
  • src/fastcs/connections/ip_connection.py
  • src/fastcs/control_system.py
  • src/fastcs/controllers/__init__.py
  • src/fastcs/controllers/base_controller.py
  • src/fastcs/controllers/controller.py
  • src/fastcs/controllers/controller_api.py
  • src/fastcs/demo/controllers.py
  • src/fastcs/transports/__init__.py
  • src/fastcs/transports/controller_api.py
  • src/fastcs/transports/epics/ca/ioc.py
  • src/fastcs/transports/epics/ca/transport.py
  • src/fastcs/transports/epics/docs.py
  • src/fastcs/transports/epics/gui.py
  • src/fastcs/transports/epics/pva/ioc.py
  • src/fastcs/transports/epics/pva/pvi.py
  • src/fastcs/transports/epics/pva/transport.py
  • src/fastcs/transports/epics/util.py
  • src/fastcs/transports/graphql/graphql.py
  • src/fastcs/transports/graphql/transport.py
  • src/fastcs/transports/rest/rest.py
  • src/fastcs/transports/rest/transport.py
  • src/fastcs/transports/tango/dsr.py
  • src/fastcs/transports/tango/transport.py
  • src/fastcs/transports/transport.py
  • tests/assertable_controller.py
  • tests/conftest.py
  • tests/test_control_system.py
  • tests/test_controllers.py
  • tests/transports/epics/ca/test_gui.py
  • tests/transports/epics/ca/test_softioc.py
  • tests/transports/epics/pva/test_pva_gui.py
  • tests/transports/rest/test_rest.py
💤 Files with no reviewable changes (2)
  • src/fastcs/transports/init.py
  • src/fastcs/transports/controller_api.py
🚧 Files skipped from review as they are similar to previous changes (15)
  • src/fastcs/transports/graphql/graphql.py
  • src/fastcs/connections/ip_connection.py
  • tests/transports/epics/ca/test_softioc.py
  • src/fastcs/transports/rest/transport.py
  • tests/transports/rest/test_rest.py
  • tests/conftest.py
  • tests/transports/epics/pva/test_pva_gui.py
  • src/fastcs/transports/epics/docs.py
  • src/fastcs/demo/controllers.py
  • src/fastcs/transports/epics/ca/transport.py
  • src/fastcs/transports/epics/pva/transport.py
  • src/fastcs/transports/graphql/transport.py
  • src/fastcs/controllers/init.py
  • tests/assertable_controller.py
  • src/fastcs/transports/epics/pva/ioc.py

Copy link
Contributor

@shihab-dls shihab-dls left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've pointed out one coderabbit suggestion I think we want, and a suggestion for scan coros, the latter is up for debate, so I'm approving as I'm happy with this current iteration anyways!

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
src/fastcs/controllers/controller.py (1)

119-135: The sleep(1) for log ordering is fragile.

Using await asyncio.sleep(1) at line 133 to ensure the error message appears after the exception log is timing-dependent and could be unreliable under load. Consider combining these into a single structured log statement or removing the artificial delay.

♻️ Suggested simplification
                 except Exception:
-                    logger.exception("Exception in scan task", period=period)
-                    self._connected = False
-
-                    await asyncio.sleep(1)  # Wait so this message appears last
-                    logger.error("Pausing scan tasks and waiting for reconnect")
+                    logger.exception(
+                        "Exception in scan task - pausing and waiting for reconnect",
+                        period=period,
+                    )
+                    self._connected = False
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/fastcs/controllers/controller.py` around lines 119 - 135, In scan_coro(),
remove the fragile await asyncio.sleep(1) used to force log ordering and instead
emit a single clear log (or back-to-back logs) that include the exception
context; specifically, in the except block for the asyncio.gather call replace
the sleep-and-error sequence with a single logger call (or immediate
logger.exception followed by logger.error) that mentions period and reconnection
state, keep setting self._connected = False, and ensure you pass the exception
context (use logger.exception or include exc_info) so the stacktrace and the
"Pausing scan tasks and waiting for reconnect" message are logged together
without relying on time-based ordering.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@src/fastcs/controllers/controller.py`:
- Around line 119-135: In scan_coro(), remove the fragile await asyncio.sleep(1)
used to force log ordering and instead emit a single clear log (or back-to-back
logs) that include the exception context; specifically, in the except block for
the asyncio.gather call replace the sleep-and-error sequence with a single
logger call (or immediate logger.exception followed by logger.error) that
mentions period and reconnection state, keep setting self._connected = False,
and ensure you pass the exception context (use logger.exception or include
exc_info) so the stacktrace and the "Pausing scan tasks and waiting for
reconnect" message are logged together without relying on time-based ordering.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 212d87c3-c425-4096-8ca9-130126b6155b

📥 Commits

Reviewing files that changed from the base of the PR and between 07bcbc4 and 3cc47af.

📒 Files selected for processing (2)
  • src/fastcs/controllers/controller.py
  • tests/test_controllers.py

@GDYendell GDYendell force-pushed the reconnect-handling branch from 3cc47af to 02fb2f6 Compare March 9, 2026 10:49
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/test_controllers.py`:
- Around line 260-284: The test
test_scan_exception_sets_disconnected_and_reconnect_resumes currently reconnects
and immediately cancels the scan task so it never verifies a resumed scan;
modify the test to wait for a second scan attempt after calling
controller.reconnect() (i.e., sleep or await long enough to surpass the 1s
backoff in Controller.scan handling) before cancelling the task so that
failing_scan (the scan coroutine created via scan decorator) actually runs again
and you can assert that scanning resumed and _connected stayed True.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: e85f22f8-3d11-497b-80d4-b6b3f3af4020

📥 Commits

Reviewing files that changed from the base of the PR and between 3cc47af and 02fb2f6.

📒 Files selected for processing (1)
  • tests/test_controllers.py

@GDYendell GDYendell force-pushed the reconnect-handling branch 2 times, most recently from 261931c to 29a6607 Compare March 9, 2026 11:21
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (1)
tests/test_controllers.py (1)

260-284: ⚠️ Potential issue | 🟠 Major

This still doesn't prove scans resumed after reconnect().

The scan loop backs off for 1 second after the first exception, but this test reconnects after 0.1s and cancels immediately. A regression where reconnect() only flips _connected without ever running another scan would still pass.

✅ Tighten the assertion to wait for a post-reconnect scan attempt
 `@pytest.mark.asyncio`
 async def test_scan_exception_sets_disconnected_and_reconnect_resumes():
     class MyTestController(Controller):
+        def __init__(self):
+            super().__init__()
+            self.scan_attempts = 0
+            self.resumed = asyncio.Event()
+
         `@scan`(0.01)
         async def failing_scan(self):
+            self.scan_attempts += 1
+            if self.scan_attempts >= 2:
+                self.resumed.set()
             raise RuntimeError("scan error")
 
     controller = MyTestController()
     controller.post_initialise()
     _, scan_coros, _ = controller.create_api_and_tasks()
@@
     await asyncio.sleep(0.1)
     assert not controller._connected
 
-    # Trigger reconnect - _connected resumes scan tasks
+    attempts_before_reconnect = controller.scan_attempts
     await controller.reconnect()
-    assert controller._connected
+    await asyncio.wait_for(controller.resumed.wait(), timeout=2)
+    assert controller._connected
+    assert controller.scan_attempts > attempts_before_reconnect
 
     task.cancel()
     with pytest.raises(asyncio.CancelledError):
         await task
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_controllers.py` around lines 260 - 284, The test
test_scan_exception_sets_disconnected_and_reconnect_resumes currently only flips
_connected and cancels the task too soon; modify it to assert that a scan
actually runs after reconnect by waiting for a post-reconnect scan attempt:
instrument the failing_scan (or the test) to record a second invocation (e.g.,
increment a counter or set an asyncio.Event) from the scan coroutine, then after
await controller.reconnect() wait for that counter/event (with a timeout) before
asserting controller._connected and cancelling the task, ensuring reconnect()
caused a new scan run; reference MyTestController, failing_scan,
controller.reconnect(), and the scan_coros/task to locate where to add the
counter/event and the post-reconnect wait.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@tests/test_controllers.py`:
- Around line 260-284: The test
test_scan_exception_sets_disconnected_and_reconnect_resumes currently only flips
_connected and cancels the task too soon; modify it to assert that a scan
actually runs after reconnect by waiting for a post-reconnect scan attempt:
instrument the failing_scan (or the test) to record a second invocation (e.g.,
increment a counter or set an asyncio.Event) from the scan coroutine, then after
await controller.reconnect() wait for that counter/event (with a timeout) before
asserting controller._connected and cancelling the task, ensuring reconnect()
caused a new scan run; reference MyTestController, failing_scan,
controller.reconnect(), and the scan_coros/task to locate where to add the
counter/event and the post-reconnect wait.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: c9036862-d040-40d1-a455-3729622014f1

📥 Commits

Reviewing files that changed from the base of the PR and between 02fb2f6 and 29a6607.

📒 Files selected for processing (3)
  • src/fastcs/connections/ip_connection.py
  • tests/test_controllers.py
  • tests/test_ip_connection.py

@GDYendell GDYendell force-pushed the reconnect-handling branch from 29a6607 to dd6944f Compare March 9, 2026 11:27
@GDYendell GDYendell force-pushed the reconnect-handling branch from dd6944f to be338db Compare March 9, 2026 11:34
@GDYendell GDYendell merged commit 4d61641 into main Mar 9, 2026
11 checks passed
@GDYendell GDYendell deleted the reconnect-handling branch March 9, 2026 11:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants