feat: async run_experiment via RunHandle + cancellation + status widget by hinderling · Pull Request #10 · pertzlab/faro

hinderling · 2026-05-15T14:08:05Z

Summary

Move the MDA feed loop onto a worker thread, expose live status through a RunHandle (psygnal Signal), and add a napari dock widget that mirrors + steers the current run. Replaces the synchronous-blocking run_experiment / continue_experiment API.

Draft: breaks the public API. Notebook updates required (see below) before merging. The async demo notebook included here is a test artifact — it must be removed before merge (see Demo notebook section).

Why

The controller's feed loop ran on the main thread, so:

- napari froze for the duration of every run (no Qt-event processing).
- run_experiment blocked the calling cell, so there was no interactive monitoring or cancellation without Ctrl-C (which sometimes left device state half-set).
- Status was opaque: "what timepoint are we on, are we lagging?" was unanswerable.
- There was no clean way to cancel or pause a long run.

Moving the loop onto its own thread fixes all of these: napari is responsive by construction, the cell returns immediately, and cancellation / pause / live status become natural.

What changed

New: `faro/core/run_status.py`

- RunStatus: immutable snapshot dataclass with state, current_event_index, current_fov, n_events_total, n_events_consumed, n_frames_received, started_at / finished_at, lag_ms, background_errors, fatal_error, …
- RunHandle: owns the worker thread + cooperative cancel/pause events, carries the run's (sorted) event list. Methods: status(), wait(), cancel(), pause(), resume(), is_running(), is_paused(). Signal: statusChanged (psygnal) emitting the latest RunStatus.
- RunState: pending → running ⇄ pausing/paused → done/error (cancelling on cancel).
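The snapshot-plus-handle split can be illustrated with a minimal, stdlib-only sketch. It is not the code in `faro/core/run_status.py`: the field subset, the `start()` method, the simulated feed loop and the final state after a cancel are assumptions made for illustration, and the psygnal `statusChanged` signal is left out so the example runs without extra dependencies.

```python
import threading
import time
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class RunStatus:
    """Immutable snapshot; only a few of the fields listed above."""
    state: str = "pending"
    n_events_total: int = 0
    n_events_consumed: int = 0


class RunHandle:
    """Hypothetical stand-in: owns the worker thread and the cancel/pause events."""

    def __init__(self, n_events):
        self.cancel_event = threading.Event()
        self.pause_event = threading.Event()
        self._lock = threading.Lock()
        self._status = RunStatus(n_events_total=n_events)
        self._thread = threading.Thread(target=self._feed, daemon=True)

    def start(self):
        self._thread.start()
        return self

    def _update(self, **changes):
        # Snapshots are never mutated; each update swaps in a new one.
        with self._lock:
            self._status = replace(self._status, **changes)

    def _feed(self):
        # Simulated feed loop: polls both events once per iteration.
        self._update(state="running")
        for i in range(self._status.n_events_total):
            while self.pause_event.is_set() and not self.cancel_event.is_set():
                self._update(state="paused")
                time.sleep(0.001)
            if self.cancel_event.is_set():
                break
            self._update(state="running", n_events_consumed=i + 1)
            time.sleep(0.001)
        self._update(state="done")

    def status(self):
        with self._lock:
            return self._status

    def cancel(self):
        self.pause_event.clear()  # a cancel while paused must still release the loop
        self.cancel_event.set()

    def pause(self):
        self.pause_event.set()

    def resume(self):
        self.pause_event.clear()

    def wait(self, timeout=None):
        self._thread.join(timeout)
        return self.status()
```

Because every reader gets a frozen snapshot, a widget or notebook cell can hold on to a `RunStatus` without racing the worker thread.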

`faro/core/controller.py`

- Controller.runStarted = Signal(object) fires on each new run/continue carrying the fresh RunHandle.
- run_experiment / continue_experiment spawn a worker thread and return the handle immediately; validation still runs synchronously on the caller. Events are sorted once and stashed on the handle so the widget renders them in execution order.
- _run_worker centralises pre-flight setup and wraps the feed loop so failures land in handle.fatal_error instead of crashing the user.
- _run_mda_with_events polls cancel_event and pause_event each iteration: pause halts feeding after the in-flight backpressure window drains; resume continues.
- fix: the engine queue is recreated per run. A cancelled run aborts the engine mid-drain, leaving a stale STOP_EVENT behind; reusing the queue made the next run's engine consume that sentinel and stall after a few events ("stuck at 3/80").
- fix: _bump_status_for_frame skips IMG_STIM snaps. A stim emission is the SLM-illuminated snap paired with its imaging frame; counting it double-updated lag/elapsed and drifted the frame count off the RTMEvent count.
- napari preview: the controller no longer carries its own preview-layer machinery, and live mode no longer has to be manually disconnected before a run. napari-micromanager's own _NapariMDAHandler keeps routing frames into the preview layer throughout the run; the controller just stops continuous sequence acquisition once at MDA start to avoid a snap-buffer race. Notebooks can drop the old "break the CoreViewerLink before running" dance.
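The stale-sentinel failure behind the per-run queue fix can be reproduced in a few lines. This is a simplified, single-threaded model: `engine`, `feed` and the event names are hypothetical stand-ins, and the real feed loop and engine run concurrently with backpressure.

```python
import queue

STOP_EVENT = object()  # sentinel telling the engine to stop


def engine(q, abort_after=None):
    """Consume events until STOP_EVENT; optionally abort early like a cancelled run."""
    done = []
    while True:
        item = q.get()
        if item is STOP_EVENT:
            break
        done.append(item)
        if abort_after is not None and len(done) == abort_after:
            break  # aborted mid-drain: the rest of the queue is left behind
    return done


def feed(q, events):
    """Queue all events, then the stop sentinel (as a finally-block would)."""
    for ev in events:
        q.put(ev)
    q.put(STOP_EVENT)


# Reused queue: a cancelled first run leaves a1, a2, a3 and its sentinel behind.
shared = queue.Queue()
feed(shared, ["a0", "a1", "a2", "a3"])
engine(shared, abort_after=1)
feed(shared, [f"b{i}" for i in range(80)])
reused = engine(shared)  # stops at the stale sentinel, never reaches b0

# The fix: a fresh queue per run.
fresh = queue.Queue()
feed(fresh, [f"b{i}" for i in range(80)])
clean = engine(fresh)

print(len(reused), len(clean))
```

In this model the second run on the shared queue only ever sees leftovers from the first run, which matches the symptom of a run stalling after a handful of events.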

New: `faro/widgets/experiment_status.py`

ExperimentStatusWidget — a napari dock panel that mirrors and controls the current run:

- State chip, legend (imaging / stim / ref).
- Event strip: one cell per RTMEvent, color-coded by type, past=opaque / future=dimmed progress fill, current cell bordered. Scales to thousands of events.
- FOV map: one dot per unique stage position, equal-aspect, visit-order path, active dot recolored to the current event type.
- Stats: event N/M, elapsed, scheduled, lag (red > 5 s), remaining, errors.
- Pause / Resume + Stop buttons.
- Theme-adaptive (napari light/dark), auto-rebinds on every new run via runStarted.

Async/Qt fixes folded in

- PYMM_SIGNALS_BACKEND=psygnal forced in faro/microscope/base.py. With a QApplication loaded, pymmcore-plus otherwise picks the Qt signal backend and queues frameReady to the main thread; if the main thread is blocked (handle.wait()), frames never reach the controller. Forcing psygnal keeps the data path direct/synchronous on the engine thread.
- Widget connects statusChanged with thread="main" + drives psygnal.qt.start_emitting_from_queue() so worker-thread emits reach QWidgets safely.
- uv.lock: bumped pymmcore-widgets past an upstream fix (_presets_widget crashing on an empty device label during MDA events).
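A minimal sketch of the backend override, assuming it only needs to be in place before pymmcore-plus chooses its signal backend (i.e. before the core object is created). The exact placement inside `faro/microscope/base.py` is not shown in this PR description.

```python
import os

# Force the psygnal backend so frameReady is delivered directly on the
# engine thread, even when a QApplication is already loaded.
# Assumption: this runs before any pymmcore-plus core object is constructed.
os.environ["PYMM_SIGNALS_BACKEND"] = "psygnal"
```

Assigning unconditionally (rather than using `setdefault`) matches the word "forced" above: a user-level setting of `qt` would reintroduce the blocked-main-thread problem.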

BREAKING: notebook updates required

Before

ctrl.run_experiment(events, stim_mode="current")   # blocked here
ctrl.finish_experiment()

After — choose one:

(a) Blocking equivalent (smallest diff):

ctrl.run_experiment(events, stim_mode="current").wait()
ctrl.finish_experiment()

(b) Non-blocking, with status / cancel / pause:

handle = ctrl.run_experiment(events, stim_mode="current")
# other cells can run; handle.status() / handle.cancel() / handle.pause()
handle.wait()                  # block at the end if desired
ctrl.finish_experiment()

Optional napari widget:

from faro.widgets import ExperimentStatusWidget
viewer.window.add_dock_widget(ExperimentStatusWidget(ctrl), name="Experiment")

Demo notebook (test artifact — remove before merge)

experiments/02_demo_sim_optogenetic/demo_sim_optogenetic_napari_async.ipynb is included only to exercise this PR against the virtual-microscope optogenetic backend (async run, pause/resume, cancel/restart, the status widget, multi-FOV). It doubles as a worked example of what the migrated notebooks could look like. It should be deleted before this PR merges — the real deliverable is the API + widget, not this notebook.

What to check / test before merging

- Every notebook in experiments/* that calls run_experiment / continue_experiment: migrate to .wait() or the non-blocking flow. Confirm none rely on the old blocking return.
- Notebooks that manually tear down the napari live link / CoreViewerLink before a run: that workaround is no longer needed; verify that removing it is safe and that the preview layer keeps updating during the run.
- tests/hardware/*: update for the new RunHandle return type; run on the Moench rig.
- Multi-channel imaging: the widget's frame counter / strip cursor assume ~1 imaging frame per RTMEvent. For multi-channel plans n_frames_received outpaces the RTMEvent count; verify the strip/stats still read sensibly or gate the assumption.
- continue_experiment + the widget: confirm the strip/map rebuild correctly for the appended events and the FOV map merges positions.
- Headless / no-Qt runs (CI, non-microscope dev machine): confirm import faro stays Qt-free and the .wait() path works without a QApplication.
- Cancel-then-restart and pause/resume on real hardware (verified on the simulator; engine-abort semantics differ per device).
- Bump the virtual-microscope lockfile pin: run uv lock --upgrade-package virtual-microscope to pick up the fixes now on its default branch (JIT pre-warm; SimCameraDevice digital ROI / MDA-teardown fix). Without this the demo notebook's first ~4 s of frames stall and the napari Snap preview freezes after a run. Commit the uv.lock change separately (it is not async/widget code).

Related (separate repo)

Two virtual-microscope fixes were needed for the demo notebook and have already landed on its default branch (virtual-env):

- JIT pre-warm: pre-warms the numba physics-step JIT before the RealtimeEngine starts; otherwise the first ~4 s of snaps stall behind a compile holding the sim lock, so frames arrive in a burst instead of paced.
- SimCameraDevice digital ROI: implements real ROI cropping. It also fixes an MDA-teardown bug: the camera previously raised NotImplementedError from set_roi, which aborted MDARunner._finish_run before it emitted sequenceFinished; napari-micromanager then never cleared _mda_running, so the Snap preview silently stopped updating after a run.

These are not part of this PR — faro just needs the lockfile bump above to pick them up.

Verification

Exercised end-to-end against the virtual-microscope optogenetic backend (napari + napari-micromanager + the widget):

- Live status flows worker → widget on the main thread (psygnal queued delivery); strip / FOV map / stats update in real time.
- Cancel mid-run, then restart from the notebook: reaches steady state, no stall.
- Pause halts feeding after the backpressure window drains; resume runs to completion.
- Frame count tracks RTMEvents 1:1 for single-channel plans; stim snaps no longer double-count.
- 87 unit tests pass.

Compatibility notes

- Headless / no Qt: works, since psygnal delivers slots synchronously without Qt. Widget package is opt-in (import faro.widgets); import faro / import faro.core stay Qt-free.
- MDA engines other than pymmcore-plus: no regression, since the controller still talks to hardware exclusively through AbstractMicroscope.

Screenshot

Move the MDA feed loop onto a worker thread, expose live status through a RunHandle + psygnal Signal, and add a minimal napari widget that mirrors the current run.

Breaking change: ctrl.run_experiment(events, ...) and ctrl.continue_experiment(...) now return a RunHandle immediately instead of blocking until the run is done. Existing notebooks that did `ctrl.run_experiment(events, ...)` must be updated to either `handle = ctrl.run_experiment(events, ...); handle.wait()` for the old blocking semantics, or to use the new non-blocking flow (poll handle.status(), subscribe to handle.statusChanged, call handle.cancel() to stop early).

What's in this commit:

- faro/core/run_status.py (new):
  * RunStatus -- immutable snapshot dataclass with state, event/FOV indices, frame count, lag_ms, error info.
  * RunHandle -- owns the worker thread + cooperative cancel event, exposes status()/wait()/cancel()/is_running() + a psygnal statusChanged signal that emits the latest RunStatus on each update. Subscribers on the main thread see queued-connection delivery via psygnal's Qt integration.
- faro/core/controller.py:
  * Controller exposes a class-level runStarted = Signal(object). Fires on every new run/continue so widgets can re-bind.
  * run_experiment / continue_experiment spawn a worker thread, return the handle, emit runStarted. Validation still happens synchronously so a bad event list raises on the calling thread.
  * _run_worker centralises pre-flight setup (writer init -- including the potentially-slow zarr rmtree on overwrite -- and Analyzer construction) and wraps the feed loop in try/except so worker-side failures land in handle.fatal_error rather than crashing the user.
  * _run_mda_with_events accepts the handle, checks handle.cancel_event at each loop iteration and in the backpressure throttle, asks the engine to cancel the in-flight event when set, and emits status updates on each RTMEvent dequeue.
  * _on_frame_ready (and ControllerSimulated._on_frame_ready) call a shared _bump_status_for_frame helper that increments n_frames_received and computes lag_ms vs event.min_start_time.
  * Now off the main thread, all the prior Qt-pumping helpers (_pump_qt_and_sleep, _qt_join, _wait_for_frame_pumping_qt) and the superqt ensure_main_thread import are obsolete and removed. The preview-layer machinery (viewer=, _on_preview_frame, _apply_preview, PREVIEW_LAYER_NAME) is also removed -- napari-micromanager's own _NapariMDAHandler already routes generator events into the preview layer.
  * finish_experiment now waits for the current handle before shutting down the Analyzer.
  * _pending_sentinels guarded by a Lock since extend_experiment now runs on the calling thread while the feed loop runs on the worker.
- faro/widgets/experiment_status.py (new):
  * ExperimentStatusWidget -- read-out of state, FOV, event index, frame count, lag, elapsed time, error count. Has a Stop button that calls handle.cancel(). Subscribes to controller.runStarted so it automatically re-binds when a new run begins; cleans up the previous handle's signal subscription on each rebind.

Verified end-to-end via a Qt smoke test:

- Live updates flow from the worker thread to the widget on the main thread (psygnal+Qt queued delivery).
- Stop button triggers handle.cancel(); the worker's cancel-check fires within one iteration and the run exits at the next event boundary.
- Starting a new run re-binds the widget to the new handle and resets the progress bar / counters.

The OmeZarrWriter init in _run_worker still pulled image height/width via self._mic.mmc.getImageHeight/Width -- a pymmcore-plus-specific call that breaks any non-pymmcore microscope. Use the AbstractMicroscope-level convention: subclasses populate self.image_height / self.image_width on the microscope instance (Moench already does this in init_scope). Fall back to mmc if the attributes aren't present but mmc is, so existing pymmcore-only microscopes keep working without code changes. Raise a clear error when neither path is available.
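The lookup order described in this commit can be sketched as follows. `resolve_image_size` is a hypothetical helper name, and the stand-in objects exist only so the example runs without hardware; `getImageHeight` / `getImageWidth` are the core methods named in the commit message.

```python
from types import SimpleNamespace


def resolve_image_size(mic):
    """Return (height, width): microscope attributes first, then mmc, else fail."""
    height = getattr(mic, "image_height", None)
    width = getattr(mic, "image_width", None)
    if height is not None and width is not None:
        return height, width
    mmc = getattr(mic, "mmc", None)
    if mmc is not None:
        # Fallback for pymmcore-only microscopes that never set the attributes.
        return mmc.getImageHeight(), mmc.getImageWidth()
    raise AttributeError(
        "microscope must define image_height/image_width or expose an mmc core"
    )


# Stand-ins for the three cases (not real microscope classes).
attr_mic = SimpleNamespace(image_height=512, image_width=640)
mmc_mic = SimpleNamespace(
    mmc=SimpleNamespace(getImageHeight=lambda: 1024, getImageWidth=lambda: 2048)
)
bare_mic = SimpleNamespace()

print(resolve_image_size(attr_mic), resolve_image_size(mmc_mic))
```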

Three independent bugs surfaced when running the new async run_experiment + ExperimentStatusWidget against a napari viewer (reproduced with the optogenetic virtual_microscope backend):

1. pymmcore-plus's signals_backend() auto-selects the *qt* backend whenever a QApplication is loaded. core.mda.events.frameReady then becomes a QtCore.SignalInstance and cross-thread emits land in Qt.QueuedConnection, where they're delivered only when the main thread pumps events. With Controller.run_experiment now spawning a worker and RunHandle.wait() joining on it, the main thread is typically idle-blocked exactly when the engine is firing frames -- so the controller's _on_frame_ready never ran, the engine completed "successfully" with zero frames received, and the pipeline never saw any data. Force PYMM_SIGNALS_BACKEND=psygnal in faro/microscope/base.py so the data path stays direct/synchronous on the engine thread regardless of whether Qt is loaded. The widget-side path (RunHandle.statusChanged) still uses psygnal's own queued delivery -- see fix #2.

2. ExperimentStatusWidget connected handle.statusChanged with the default (direct) connection. Status updates emitted from the worker thread therefore ran the widget's _refresh slot synchronously off-main, calling QLabel.setText / QProgressBar.setValue from a non-GUI thread. Under napari that lands in vispy's OpenGL compositor and aborts with "Cannot make QOpenGLContext current in a different thread" -> SIGABRT (kernel hard-crash in VSCode Jupyter). Switch to connect(..., thread="main") so psygnal queues the call into its main-thread queue.

3. psygnal's queued callbacks live in QueuedCallback._GLOBAL_QUEUE, which nothing drains by default -- the widget would be invoked on the main thread, but only when something explicitly calls psygnal.emit_queued(). RunHandle's docstring claims auto-Qt delivery; that's not how psygnal actually works. Call psygnal.qt.start_emitting_from_queue() in the widget's __init__, which installs a main-thread QTimer that fires emit_queued() on every Qt event-loop tick. Idempotent and global, so multiple widgets / multiple runs are safe.

Lockfile: bump pymmcore-widgets (8c8f76e -> 48ff414) so the unrelated upstream crash in pymmcore_widgets._presets_widget._on_property_changed when handed an empty device label (virtual_microscope's shutter) is included. Without that bump, the MDA engine itself aborts on the first setShutterOpen() once frames actually start flowing.

Verified end-to-end against virtual_microscope's optogenetic backend:

- headless async run: 5/5 frames (regression check, unchanged)
- napari.Viewer() + handle.wait(): 5/5 frames (was 0/5)
- napari + napari-micromanager + widget: 5/5 frames, no crash, exit 0
- widget visibly updates progress / frames / state mid-experiment (sampled QLabel.text() while pumping Qt events)
- 87 unit tests still pass

Sibling of demo_sim_optogenetic.ipynb that exercises the new async run_experiment + RunHandle + ExperimentStatusWidget end-to-end against virtual_microscope's optogenetic backend, with a live napari viewer dock-attached. Walks through: handle = ctrl.run_experiment(...) is non-blocking, the kernel is free; poll handle.status() while it runs; subscribe to handle.statusChanged from the kernel side; cancel via the widget Stop button or handle.cancel(); handle.wait() blocks if you want the old synchronous semantics; continue_experiment() re-binds the widget automatically via runStarted. Phases are concatenated with combine(..., axis="t") per the new RTMSequence API.

Backend changes that make an async run inspectable and steerable -- the data the new ExperimentStatusWidget renders, plus two bug fixes surfaced while building it.

run_status.py

- RunHandle.events: optional snapshot of the (sorted) RTMEvents the handle is driving, so widgets can render per-event visualisations (event strip, FOV map) that need the full plan up front.
- Pause/resume: RunState gains "pausing"/"paused"; RunHandle gains pause()/resume()/is_paused() and a pause_event the feed loop polls. cancel() now also clears the pause event so a cancel while paused still releases the feed loop.

controller.py

- run_experiment / continue_experiment sort events once (by min_start_time, then position) and stash the sorted list on the handle, so the order the worker processes matches what the widget displays.
- Feed loop honors pause_event: before pulling the next RTMEvent it checks the flag, flips state to "paused", and idles until resume() -- the MDA engine drains whatever is already queued, then waits.
- fix: the engine queue (self._queue) is recreated per run. The finally-block feeds a STOP_EVENT sentinel to stop the engine; on a *cancelled* run cancel_mda() aborts the engine, which may stop without draining the queue, leaving stale events + the sentinel behind. Reusing that queue made the next run's engine consume the stale sentinel and exit after a few events ("stuck at 3/80"). A fresh queue per run fixes it.
- fix: _bump_status_for_frame skips IMG_STIM frames. A stim emission is the SLM-illuminated snap paired with its imaging frame; counting it double-updated the status (lag/elapsed refreshing twice per stim event) and made n_frames_received drift away from the RTMEvent count. Imaging + ref frames are the meaningful data frames.

Verified end-to-end against the optogenetic virtual-microscope backend: cancel mid-run then restart reaches steady state (no stall); pause halts feeding after the backpressure window drains and resume continues to completion; frame count tracks RTMEvents 1:1 for single-channel plans.

Rework the minimal status widget into a full run dashboard, driven by the RunHandle data exposed in the previous commit.

Components (top to bottom):

- State chip -- RUNNING / PAUSED / DONE / ... as plain text in a translucent-neutral rounded chip (no per-state fill: a colored banner competed with the imaging/stim/ref legend colors).
- Legend chips -- imaging / stim / ref; the chip matching the current event type is fully opaque, the others dimmed.
- EventStrip -- one cell per RTMEvent, color-coded by type. Past + current cells opaque (progress fill), future cells dimmed. Same-type runs are coalesced into single fills so thousands of events render with correct alpha instead of over-stacking at sub-pixel widths. Empty state draws a "(no events loaded)" placeholder.
- FovMap -- one dot per unique FOV position, equal-aspect (a straight line of FOVs stays a line), grey visit-order path, active dot recolored to the current event type. Pinned square via resizeEvent. Paints its own rounded panel background; "FOV X/Y" counter in the corner.
- Stats form -- event N/M, elapsed, scheduled, lag, remaining, errors. Times formatted hh:mm:ss with the leading unit suffixed and dropped when zero; lag turns red past 5 s. Wrapped in a shaded panel echoing napari's layer-controls boxes.
- Pause/Resume + Stop buttons.

Threading / theming details:

- statusChanged is connected with thread="main" and the widget calls psygnal.qt.start_emitting_from_queue() so worker-thread emits are delivered on the GUI thread (drives QWidgets safely under napari).
- A 250 ms QTimer ticks the elapsed/remaining clocks between status emissions so time fields don't freeze between frames.
- The strip cursor tracks n_frames_received (actual snaps), not n_events_consumed (the feed loop runs 3-4 ahead via backpressure, which made the strip jump several cells at run start).
- Colors/fonts derive from the Qt palette so the widget adapts to napari's light/dark theme; corner radii match napari widgets.

Add a second stage position (20, 20, 0) to the baseline / stim / recovery sequences so the demo exercises a 2-FOV acquisition -- the ExperimentStatusWidget's FOV map then shows both positions and the visit-order path between them. Drop the frame interval 1.5s -> 1s.

hinderling · 2026-05-16T11:38:39Z

@alandolt can you have a look if you see any general issues with this architecture change? still a few open TODOs before merging, but the main idea is there i think! but would be great to have your input before i start migrating the other notebooks etc. I think this will also be useful more long-term, running experiments on different microscopes simultaneously with BO for example, in combo with pymmcore-proxy.

Add FrameDispenser.cancel() and the FrameWaitCancelled exception so a thread blocked in wait_for_frame / get_predecessor is woken immediately instead of sitting out the full timeout. This lets an experiment abort promptly: a feed loop parked in an up-to-80s stim-mask wait is released the instant the run is cancelled.
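One way to get the "woken immediately" behaviour is a condition variable whose predicate includes a cancel flag. The class below is a hypothetical, much-reduced dispenser (one slot per frame index, a `put` method, a plain `TimeoutError`), not the actual FrameDispenser implementation.

```python
import threading


class FrameWaitCancelled(Exception):
    """Raised in a waiter when the dispenser is cancelled."""


class FrameDispenser:
    """Hypothetical minimal dispenser: one slot per frame index."""

    def __init__(self):
        self._cond = threading.Condition()
        self._frames = {}
        self._cancelled = False

    def put(self, index, frame):
        with self._cond:
            self._frames[index] = frame
            self._cond.notify_all()

    def cancel(self):
        # Wake every blocked waiter now instead of letting it sit out the timeout.
        with self._cond:
            self._cancelled = True
            self._cond.notify_all()

    def wait_for_frame(self, index, timeout=80.0):
        with self._cond:
            ready = self._cond.wait_for(
                lambda: self._cancelled or index in self._frames, timeout
            )
            if self._cancelled:
                raise FrameWaitCancelled(f"wait for frame {index} cancelled")
            if not ready:
                raise TimeoutError(f"frame {index} not available after {timeout} s")
            return self._frames[index]
```

A waiter parked with an 80 s timeout returns as soon as `cancel()` is called, because `notify_all()` re-evaluates the predicate and the cancel flag makes it true.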

Cancellation: RunHandle gains an on_cancel hook, invoked synchronously from cancel(), that wakes a feed loop blocked in a stim-mask wait via Analyzer.cancel_pending_waits(). Previously a cancel issued during that wait took up to the stim-mask timeout (~80s) to take effect, leaving the frame handler connected in the meantime. Queue stats: Analyzer.queue_stats() / Controller.queue_stats() expose storage, pipeline and deferred queue depths for the status widget. finish_experiment runs its teardown (run wait + Analyzer drain) on a worker thread and pumps Qt, so napari stays responsive during the drain. Lag is anchored to the first frame's acquisition start rather than the worker's start time, so worker/engine startup (~1s) is no longer charged to every lag reading.

Stop now cancels the run and then runs finish_experiment(), so the next run starts clean instead of leaking the old Analyzer; the state banner shows STOPPING... while the drain runs. Stats are split into three panels (timing / queues / errors). The storage and pipeline queue depths render as grayscale fill bars that turn red past 80% of capacity; deferred shows as a plain count. The FovMap is freely resizable instead of pinned square.

alandolt · 2026-05-19T09:30:31Z

looks super cool and well executed. Thanks.
After a first glance through the code I don't see any issue, will probably soon push forward to also expand controller by an update method that replaces the old stored acquisition events (as seen here https://github.com/pertzlab/faro/blob/main/faro/core/controller.py), as for my agent stuff this is the way to go for some agent classes.
Will try it out on the real mic tomorrow.

Empty the backpressure window (~3 queued MDAEvents) into a held buffer when the user pauses, refilling on resume. On sparse experiments the queued events would otherwise keep snapping for minutes after Pause. min_start_times are not shifted -- late events catch up on resume.
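The drain-and-refill step can be sketched with a plain `queue.Queue`. The helper names are hypothetical and the events are placeholder strings; the real code operates on MDAEvents inside the feed loop.

```python
import queue


def drain_on_pause(engine_queue):
    """Move everything still queued for the engine into a held buffer."""
    held = []
    while True:
        try:
            held.append(engine_queue.get_nowait())
        except queue.Empty:
            return held


def refill_on_resume(engine_queue, held):
    """Put held events back in their original order; timestamps are untouched."""
    for event in held:
        engine_queue.put(event)
    held.clear()


q = queue.Queue()
for name in ("ev0", "ev1", "ev2"):  # the ~3-event backpressure window
    q.put(name)

held = drain_on_pause(q)      # pause: the engine has nothing left to snap
paused_empty = q.empty()
refill_on_resume(q, held)     # resume: same events, same order
```

Since min_start_times are not shifted, events that became overdue during the pause run back-to-back on resume until the schedule catches up.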

WaitEvent is an RTMEvent that emits no MDAEvents; wait(duration_s) builds one and combine() treats it as a pure time marker -- it extends wall-clock for subsequent phases but claims no t/p index. The feed loop waits for the engine to catch up, then counts down to max(scheduled_start, now) + duration_s so a pause-drain can't eat the wait window. Adds a "waiting" RunState + wait_remaining_s, and anchors started_at to the first acquired frame so a leading wait doesn't tick elapsed/lag. Demo notebook brackets the stim phase with wait(5).
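The countdown target is a one-line formula; a small sketch makes the pause-drain case concrete. `wait_deadline` is a hypothetical helper name and the times are arbitrary seconds on a shared clock.

```python
def wait_deadline(scheduled_start, now, duration_s):
    """End of the wait: the full duration counted from whichever is later."""
    return max(scheduled_start, now) + duration_s


# On time: the engine caught up at t=25, before the wait's scheduled start at t=30.
on_time = wait_deadline(scheduled_start=30.0, now=25.0, duration_s=10.0)

# Late: a pause-drain delayed the feed loop until t=42; the wait still lasts 10 s.
late = wait_deadline(scheduled_start=30.0, now=42.0, duration_s=10.0)

print(on_time, late)
```

Counting from `scheduled_start` alone would leave the late case with a deadline already in the past, so the wait would be skipped entirely.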

Show a "WAITING hh:mm:ss" countdown banner, keep the strip cursor on the wait cell during the countdown, and draw wait cells as solid gray (hatch overlaid only when wide enough to read). Pause->Resume flips immediately during a wait by reading pause_event rather than the run state. Also remove the 1px inter-phase gap (runs span to the next run's start; active border widened to match) and dim inactive legend chips.

Pin the invariants: pause/resume changes when frames are acquired, not what -- a paused run yields byte-identical OME-Zarr to an unpaused one, same frame count, clean cancel-during-pause. WaitEvents claim no t/p index, emit no MDAEvents (add time, not frames), and shift subsequent min_start_times by at least their duration. Pause is driven by polling status() with min_start_time-spaced events so the tests are deterministic.

tests/fixtures.py imports TrackerMotile at module load, so test collection fails without motile installed.

motile is the non-default tracking backend (a runtime dep), not a test tool, so list it as a feature extra alongside the other backends and have the test extra pull it in via faro[motile].

A WaitEvent carries no channels and no metadata, so events_to_dataframe emitted a bogus zero-channel imaging row for it and validate_pipeline falsely flagged it as missing required metadata. Skip WaitEvents in both (they are timed gaps, not acquired frames). Add regression tests, including that validate_hardware already tolerates them.

Niesen runs a WakeUpLaser keepalive thread and holds a DMD/SLM handle but had no shutdown() override, so pymmcore native threads could keep the process alive as a zombie that blocks the next session. Mirror Moench.shutdown: stop the keepalive thread and unloadAllDevices.

Re-run outputs for the wait-bracketed experiment (no source changes).

run_experiment/continue_experiment are non-blocking and return a RunHandle; update every example to call .wait() and add a usage-level note covering pause/resume/cancel and the ExperimentStatusWidget, plus a "Timed waits" note for wait() between phases via combine.

run_experiment is now non-blocking; these notebooks read results or call post_experiment right after, so add .wait(). The demo_sim_optogenetic runs had post_experiment() between run and finish, which raced. Edit-only; outputs not refreshed.

Split each run cell into an async launch (dock ExperimentStatusWidget, return handle) and a finalize cell (handle.wait() before post_experiment and result reads). Drop the post-run time.sleep(10/90) plumbing waits, now covered by handle.wait() + finish/post drain. Edit-only.

handle.wait() only joins the run worker, not the analyzer that writes tracks. The deleted time.sleep() was the crude analyzer-drain wait; replace it with ctrl.finish_experiment() (waits for the run + drains the pipeline) before generate_exp_data_from_tracks / parquet reads.

stim_rtmsequence: dock ExperimentStatusWidget; its existing finish_experiment() already waits for the async run + drains. stim_dfacquire: serialize the two-controller phases with handle.wait() so they don't run concurrently on one microscope, and dock the widget. (Pre-existing bug left untouched: cell-12/cell-16 are mis-tagged as markdown, so df_acquire_2 + post-processing don't execute.)

Six sequential phases driven by run_experiment + chained continue_experiment on one controller, with manual drug pipetting at each cell boundary. Add .wait() after every run/continue so a phase finishes before the next continue_experiment (which would otherwise raise "already running") and before the napari reconnect/pipetting pause. Preserves the original blocking semantics; dock the status widget once (re-binds on each phase via runStarted).

cell-12 (builds df_acquire_2) and cell-16 (post-processing) were tagged as markdown, so they never executed -- cell-15 referenced an undefined df_acquire_2 and results were never written. Convert both to code. The post-processing cell also gets the async treatment: wait on phase 2, drain via finish_experiment (replacing the dropped sleep), then read.

napari-micromanager keeps routing frames into its preview layer during a run (the controller only stops continuous acquisition once at MDA start), so the manual mm_wdg._core_link.cleanup() before each run + CoreViewerLink reconnect after is no longer needed. Remove it from run/phase cells so the live link stays connected throughout. DMD-calibration cells (which drive the camera directly, outside an MDA run) and the manual reconnect/break utility cells are left as-is.

uv lock --upgrade-package virtual-microscope: e2aca8da -> bd4ac3e3 to pick up the JIT pre-warm + SimCameraDevice digital-ROI/MDA-teardown fixes on its default branch. Also syncs the motile extra into the lock (added to pyproject earlier but not yet relocked).

calibrate_dmd gains background=True (default): it runs dmd.calibrate on a DMDCalibration worker thread while pumping Qt in the caller, so napari keeps updating the live preview during calibration instead of freezing. Adds a module-level _pump_qt_events helper; background=False runs the calibration synchronously as before. Also fix dmd.py's frameReady.disconnect() calls: the calibration's frame-collection blocks now disconnect their own named handlers (frameReady.disconnect(handler)) instead of the no-arg disconnect, which also tore down napari-micromanager's preview listener and left the preview dead after calibration.

tests/hardware/test_cell_migration.py and test_line_stimulation.py were left behind when the hardware suite moved into tests/hardware/pertzlab/. They're superseded by the pertzlab copies (line_stimulation is identical bar the import; cell_migration's pertzlab version adds the shared cellpose fixture + a timestep-ordering check). Running both trees in one pytest session spun up a second session-scoped Moench, which then failed to open the DMD ("Mosaic3: No Mosaic3 devices found") because the first microscope already held it. Drop the duplicates and their now-orphaned tests/hardware/conftest.py.

Scale the lag stress test down: N_FRAMES 12->6, interval 5->2 s, pipeline delay 7->3 s. The lag invariant is the delay/interval ratio (interval < delay < 2*interval), preserved here, so coverage is identical -- the pipeline still lags ~1.5 frames and every stim mask must still land without a dispenser skip or deadlock. Verified green on the Moench.
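The invariant can be checked numerically for the old and new parameters. `lag_ratio` is a hypothetical helper written for this check, not part of the test suite.

```python
def lag_ratio(interval_s, delay_s):
    """Return delay/interval after checking interval < delay < 2 * interval."""
    if not interval_s < delay_s < 2 * interval_s:
        raise ValueError("lag invariant violated")
    return delay_s / interval_s


old = lag_ratio(interval_s=5, delay_s=7)  # previous settings
new = lag_ratio(interval_s=2, delay_s=3)  # scaled-down settings
print(old, new)
```

Both settings fall strictly between one and two frame intervals of delay, which is the regime the test is meant to stress.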

TIME_BETWEEN_TIMESTEPS_S dominated wall-clock on every acquisition test (4 frames x 5 s = 20 s scheduled per test). Drop it to 2 s -- the 3-FOV acquisition fits comfortably and the tests assert correctness (which frames stim, masks present, no background errors), not the interval, so coverage is unchanged. Also drop stim_mask_timeout's pipeline delay 10 s -> 3 s (still above its 1 s timeout) to speed the post-run drain. Verified on the Moench: pertzlab suite 14 passed / 1 skipped, 4:15 -> 2:50. The cellpose + empty-fov tests got the same mechanical interval change but were not run here (cellpose extra not installed; empty-fov skips without .preflight.json).

apply_fov_batching gains offset_min_start_time (default True). FOVs in a batch are imaged sequentially, not simultaneously, so the k-th FOV of a batch only starts ~k * time_per_fov after the batch's first FOV. Encoding that offset into each event's min_start_time keeps the scheduled per-FOV frame interval consistent and makes lag measurement meaningful (lag is acquisition-start minus min_start_time). The first FOV of every batch gets a 0 within-batch offset; batches after the first still get their batch wall-clock offset on top.
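The offset rule can be written out as a small calculation. This is an illustrative model only: the function name, the batch representation and the numbers are assumptions, and the real apply_fov_batching works on events rather than FOV names.

```python
def offset_min_start_times(batches, batch_start_times, time_per_fov):
    """Map each FOV to batch start + k * time_per_fov for its slot k in the batch."""
    schedule = {}
    for fovs, batch_start in zip(batches, batch_start_times):
        for k, fov in enumerate(fovs):
            # First FOV of a batch gets a zero within-batch offset.
            schedule[fov] = batch_start + k * time_per_fov
    return schedule


schedule = offset_min_start_times(
    batches=[["A", "B", "C"], ["D", "E"]],
    batch_start_times=[0.0, 6.0],  # wall-clock offset of each batch
    time_per_fov=2.0,
)
print(schedule)
```

With these offsets, a FOV imaged third in its batch is not reported as lagging merely because two other FOVs were imaged before it.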

statusChanged is delivered cross-thread (worker -> Qt main) via psygnal's queued emission, which proved unreliable in some embeddings (notably Jupyter notebooks): the event strip / FOV cursor / counters stayed frozen even though handle.status() was current and Stop still worked. The _tick QTimer already polled the time + queue fields between emissions; make it do a full _refresh from the latest status snapshot so the whole widget stays live regardless of signal delivery. The statusChanged connection stays as a push optimisation; _refresh's repaints are change-guarded, so a full refresh at the 250 ms tick is cheap.
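The push-plus-poll arrangement can be sketched without Qt. `Panel` is a hypothetical stand-in for the widget: both the signal slot and the timer tick would call the same `refresh`, and the change guard keeps repeated calls cheap.

```python
class Panel:
    """Hypothetical stand-in for the widget's change-guarded refresh."""

    def __init__(self, get_status):
        self._get_status = get_status  # e.g. a bound handle.status
        self._shown = None
        self.repaints = 0

    def refresh(self):
        status = self._get_status()
        if status == self._shown:
            return  # nothing changed since the last paint
        self._shown = status
        self.repaints += 1  # a real widget would repaint here


current = {"status": ("running", 1)}
panel = Panel(lambda: current["status"])

panel.refresh()                     # pushed by a signal
panel.refresh()                     # timer tick, same snapshot: no repaint
current["status"] = ("running", 2)  # signal for this update is lost
panel.refresh()                     # next tick still picks it up
```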

pipeline.run() names the per-phase tracks file {fov}_phase_{phase_id}_latest.parquet whenever phase metadata is present and reads metadata['phase_id'] directly. The guard keyed off "phase_id in metadata OR phase_name in metadata", so supplying phase_name without phase_id passed the guard and then KeyError'd mid-run, after acquisition had already started.

- validate_pipeline now flags any event carrying phase_name but no phase_id, so validate_events fails up front with a clear message.
- run() keys the per-phase filename off phase_id alone, so the combination can no longer crash even when validation is skipped.
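Both halves of the fix can be sketched in a few lines. This is a hedged illustration, not the faro implementation: the helper names `find_phase_metadata_errors` and `tracks_filename`, the error wording, and the non-phase fallback filename `{fov}_latest.parquet` are assumptions; only the per-phase filename pattern comes from the commit message.

```python
def find_phase_metadata_errors(events_metadata):
    """Up-front check: flag events that carry phase_name but no phase_id."""
    errors = []
    for i, md in enumerate(events_metadata):
        if "phase_name" in md and "phase_id" not in md:
            errors.append(
                f"event {i}: phase_name={md['phase_name']!r} given without phase_id"
            )
    return errors


def tracks_filename(fov, metadata):
    """Key the per-phase filename off phase_id alone, so it cannot KeyError."""
    if "phase_id" in metadata:
        return f"{fov}_phase_{metadata['phase_id']}_latest.parquet"
    # Hypothetical fallback name for events without a phase_id.
    return f"{fov}_latest.parquet"


events = [
    {"phase_id": 0, "phase_name": "baseline"},
    {"phase_name": "stim"},  # the problematic combination
    {},
]
errors = find_phase_metadata_errors(events)
names = [tracks_filename(3, md) for md in events]
```

The validation step reports the problem before acquisition starts, while the filename helper stays safe even if validation is skipped.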

pipeline_post.run() (the deferred/reprocess-from-disk path) had the same "phase_id in metadata or phase_name in metadata" guard followed by a direct metadata['phase_id'] access as pipeline.run(). Key it off phase_id alone so a phase_name-without-phase_id event can't KeyError on the deferred path either.

A WaitEvent carries an explicit gap (duration_s); _combine_pair was adding the inferred inter-source interval on top of it, double-counting the wait. The feed loop sleeps for duration_s, and the next source's min_start_time already included the wait *plus* an inferred interval, so combine(wait(10), phase) at a 10 s interval started the first acquisition at t=20 (10 s wait countdown + 10 s engine pacing) instead of t=10. Fix: skip _infer_interval when either side of the merge boundary is a WaitEvent. combine(wait(10), phase) now starts the first frame at t=10, and combine(a, wait(10), b) starts b exactly duration_s after a's last frame (verified: [0,10,20] + wait(10) -> [30,40,50]).
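The corrected merge rule can be reproduced on plain lists of start times. This is a toy model under stated assumptions, not `_combine_pair` itself: `Wait` stands in for WaitEvent, frame sources are lists of relative start times, and `infer_interval` and `combine` are hypothetical simplifications of the real helpers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Wait:
    """Stand-in for WaitEvent: an explicit gap in seconds."""
    duration_s: float


def infer_interval(times):
    """Spacing between the last two frames; 0 if fewer than two frames."""
    return times[-1] - times[-2] if len(times) >= 2 else 0.0


def combine(*sources):
    """Concatenate frame sources and waits into absolute start times."""
    times = []
    clock = 0.0        # time of the last frame, or end of the last wait
    pending_gap = 0.0  # inferred interval owed before the next frame source
    for src in sources:
        if isinstance(src, Wait):
            clock += src.duration_s
            pending_gap = 0.0  # explicit gap replaces the inferred interval
            continue
        if not src:
            continue
        start = clock + pending_gap
        shifted = [start + t for t in src]
        times.extend(shifted)
        clock = shifted[-1]
        pending_gap = infer_interval(src)
    return times


a = [0, 10, 20]
b = [0, 10, 20]
after_wait = combine(Wait(10), b)
bracketed = combine(a, Wait(10), b)
```

Resetting `pending_gap` at a wait is the whole fix in this model: the explicit duration is the only gap applied across that boundary, while two frame sources merged directly still get the inferred interval.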

hinderling mentioned this pull request May 16, 2026

fix: stop live mode + pump Qt event loop during run_experiment #9

Closed

3 tasks

hinderling and others added 7 commits May 16, 2026 11:35

hinderling force-pushed the feat/async-run-handle branch from d473b9b to 3c0e798 May 16, 2026 09:53

hinderling marked this pull request as ready for review May 16, 2026 11:29

hinderling added 3 commits May 19, 2026 09:52

hinderling added 15 commits May 21, 2026 12:58

Add motile to the test extra

bc6c0b2

tests/fixtures.py imports TrackerMotile at module load, so test collection fails without motile installed.

Move motile into its own extra, referenced by test

c367a2c

motile is the non-default tracking backend (a runtime dep), not a test tool, so list it as a feature extra alongside the other backends and have the test extra pull it in via faro[motile].

Refresh async optogenetic demo notebook outputs

30b6b88

Re-run outputs for the wait-bracketed experiment (no source changes).

hinderling and others added 12 commits May 28, 2026 10:26

hinderling wants to merge 37 commits into pertzlab:main from hinderling:feat/async-run-handle


hinderling commented May 16, 2026


alandolt commented May 19, 2026


3 participants
