Threading Utils (and fix for Native Module flakey test) by jeff-hykin · Pull Request #1663 · dimensionalOS/dimos

jeff-hykin · 2026-03-25T07:03:34Z

Problem

A flakey test, and a history of flakey tests surrounding threads in modules.

Solution

Smart thread tooling with auto-cleanup to reduce the risk of bad cleanup and reduce bloat.

Breaking Changes

None, did more testing than usual because this touches core.

How to Test

# Thread utility tests (deadlocks, races, stress)
python -m pytest dimos/utils/test_thread_utils.py -v --noconftest

# NativeModule + MCP server tests
python -m pytest dimos/core/test_native_module.py dimos/agents/mcp/ -v --noconftest

# Full suite
python -m pytest --timeout=120 -q --ignore=dimos/perception/detection/type

Contributor License Agreement

I have read and approved the CLA.

greptile-apps · 2026-03-25T07:07:25Z

Greptile Summary

This PR replaces ad-hoc threading and event-loop management spread across ModuleBase, NativeModule, and McpServer with three new composable utilities in thread_utils.py: ThreadSafeVal (reentrant-lock value wrapper), ModuleThread/AsyncModuleThread (self-registering threads with auto-cleanup via module disposables), and ModuleProcess (managed subprocess with watchdog and SIGTERM→SIGKILL escalation). The refactor removes ~150 lines of duplicated boilerplate, fixes the root cause of the flakey-thread tests (non-reentrant locks inside context managers), and adds an explicit ModState lifecycle to prevent restarts after stop.

Key changes:
- ThreadSafeVal uses RLock to prevent deadlocks when set() is called inside a with block; this was the confirmed root cause of flakey tests
- AsyncModuleThread and ModuleProcess self-register Disposable(self.stop) in module._disposables on construction, so teardown is automatic
- NativeModule is reduced to ~35 lines of business logic; all subprocess/watchdog plumbing moved to ModuleProcess
- ModuleBase.start() now sets state to "started" (guarded against re-entry from "stopped") and stop() is idempotent
- All _close_module() call sites updated to use the public stop() API

Issues found:
- test_mcp_server_lifecycle calls server._start_server(port=port) directly without first starting the async thread; _async_thread.loop is None at that point, so asyncio.run_coroutine_threadsafe will raise AttributeError before the retry loop runs
- AsyncModuleThread.loop is typed as non-optional (asyncio.AbstractEventLoop) but returns None until start() is called
- After __setstate__, _async_thread is None; a subsequent start() call will raise AttributeError: 'NoneType' object has no attribute 'start' rather than a helpful error message

Confidence Score: 4/5

Core threading logic is solid and well-tested; one new integration test has a definite bug that will fail on CI, plus two minor defensive-programming gaps — none affect the production code path

The fundamental fix (RLock, auto-cleanup disposables) is correct and thoroughly tested. The three issues found are: one failing test (P1, easy one-liner fix), one misleading type annotation (P2), and one unhelpful error message path for a rare deserialization edge case (P2). None of these affect the production start()/stop() lifecycle or the actual threading correctness. Prior review concerns are either resolved or explicitly addressed by the developer. Score 4 rather than 5 because the new integration test will fail as written.

dimos/agents/mcp/test_mcp_server.py (test_mcp_server_lifecycle will crash before the retry loop), dimos/utils/thread_utils.py (AsyncModuleThread.loop type annotation)

Important Files Changed

Filename	Overview
dimos/utils/thread_utils.py	New file introducing ThreadSafeVal (RLock-based wrapper), ModuleThread, AsyncModuleThread, ModuleProcess, and safe_thread_map — core building blocks of the threading refactor; the loop property returns None before start() but is typed as non-optional
dimos/core/module.py	Replaces manual loop/thread management with AsyncModuleThread and adds ModState lifecycle tracking; __setstate__ sets _async_thread=None which will cause AttributeError if start() is called on a deserialized module
dimos/core/native_module.py	Removes ~120 lines of manual subprocess/thread management, delegating entirely to ModuleProcess; clean simplification with no new issues
dimos/agents/mcp/test_mcp_server.py	New lifecycle integration test calls _start_server() directly without starting the async thread first — _async_thread.loop is None at that point, causing an AttributeError before any retry logic runs
dimos/utils/test_thread_utils.py	Comprehensive new test suite covering deadlocks, race conditions, idempotency, and stress scenarios for all new thread utilities; well-structured with good edge-case coverage
dimos/utils/typing_utils.py	Adds ExceptionGroup polyfill for Python < 3.11 and consolidates TypeVar compatibility shim; clean utility addition
dimos/agents/mcp/mcp_server.py	Removes stale self._loop references in favour of self._async_thread.loop; the removed assert loop is not None was correct to remove since the loop guard moved to stop()
dimos/core/test_native_module.py	Updated assertions to use new _proc attribute and is_alive property; adds idempotent stop() call; correctly reflects the new API

Sequence Diagram

sequenceDiagram
    participant Caller
    participant ModuleBase
    participant AsyncModuleThread
    participant ModuleProcess
    participant CompositeDisposable

    Caller->>ModuleBase: __init__()
    ModuleBase->>CompositeDisposable: create _disposables
    ModuleBase->>AsyncModuleThread: AsyncModuleThread(module=self)
    AsyncModuleThread->>CompositeDisposable: add(Disposable(self.stop))
    note over AsyncModuleThread: _loop = None

    Caller->>ModuleBase: start()
    ModuleBase->>ModuleBase: mod_state → "started"
    ModuleBase->>AsyncModuleThread: start()
    AsyncModuleThread->>AsyncModuleThread: create event loop + daemon thread
    note over AsyncModuleThread: _loop = running loop

    Caller->>ModuleBase: (NativeModule) start()
    ModuleBase->>ModuleProcess: ModuleProcess(module=self, ...)
    ModuleProcess->>CompositeDisposable: add(Disposable(self.stop))
    ModuleProcess->>ModuleProcess: start() → Popen + watchdog ModuleThread

    Caller->>ModuleBase: stop()
    ModuleBase->>ModuleBase: mod_state → "stopped"
    ModuleBase->>CompositeDisposable: dispose()
    CompositeDisposable->>AsyncModuleThread: stop() → loop.stop() + join
    CompositeDisposable->>ModuleProcess: stop() → SIGTERM/SIGKILL + join
    ModuleBase->>ModuleBase: rpc.stop(), _tf.stop()
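
The SIGTERM/SIGKILL step at the end of the diagram can be sketched with the standard library alone. This is a minimal illustration, not the PR's ModuleProcess: the function name `stop_process` and the 2-second default are assumptions made for the example.

```python
import subprocess
import sys


def stop_process(proc: subprocess.Popen, timeout: float = 2.0) -> int:
    """Ask the child to exit (SIGTERM on POSIX), then force it (SIGKILL on POSIX)."""
    if proc.poll() is not None:
        return proc.returncode  # already exited, nothing to do
    proc.terminate()
    try:
        return proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        # The child ignored or outlived the polite request; escalate.
        proc.kill()
        return proc.wait()


# Hypothetical child process that would otherwise run for a minute.
child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
exit_code = stop_process(child)
```

Calling `stop_process` a second time returns the stored exit code without sending any signal, which is the idempotent behaviour the summary describes for stop().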

Comments Outside Diff (2)

dimos/agents/mcp/test_mcp_server.py, line 63

_start_server called before async thread is started

_start_server is called directly here without first calling server.start() (or at minimum server._async_thread.start()). As a result self._async_thread.loop returns None (its initial value from AsyncModuleThread.__init__), and asyncio.run_coroutine_threadsafe(server.serve(), None) will raise AttributeError: 'NoneType' object has no attribute 'call_soon_threadsafe', crashing the test before the retry loop is ever reached.

In the normal lifecycle McpServer.start() calls super().start() → self._async_thread.start() first, which populates the loop, and then calls _start_server(). The test bypasses that sequence to supply a custom port, but needs to replicate it:

or call the public API and configure the port separately.
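
The ordering requirement described above (a running loop must exist before a coroutine is handed to it) can be sketched with plain asyncio. The helpers `start_loop_thread` and `stop_loop_thread` and the `serve` coroutine are stand-ins invented for this example; they are not the McpServer or AsyncModuleThread API.

```python
import asyncio
import threading
from typing import Tuple


async def serve() -> str:
    # Stand-in for a real server coroutine.
    await asyncio.sleep(0)
    return "served"


def start_loop_thread() -> Tuple[asyncio.AbstractEventLoop, threading.Thread]:
    """Create an event loop and run it forever on a daemon thread."""
    loop = asyncio.new_event_loop()
    thread = threading.Thread(target=loop.run_forever, daemon=True)
    thread.start()
    return loop, thread


def stop_loop_thread(loop: asyncio.AbstractEventLoop, thread: threading.Thread) -> None:
    """Stop the loop from outside its thread, then wait for the thread to exit."""
    loop.call_soon_threadsafe(loop.stop)
    thread.join(timeout=2.0)
    loop.close()


# 1. Start the loop first ...
loop, thread = start_loop_thread()
# 2. ... and only then submit work to it from another thread.
future = asyncio.run_coroutine_threadsafe(serve(), loop)
result = future.result(timeout=2.0)
stop_loop_thread(loop, thread)
```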

dimos/core/module.py, line 163-169

_async_thread = None after unpickling breaks subsequent start() calls

After __setstate__, _async_thread is set to None. Any code path that later calls start() will attempt self._async_thread.start() and raise AttributeError: 'NoneType' object has no attribute 'start'.

The previous implementation stored _loop = None after unpickling, but _loop was only accessed inside _close_module with a getattr(self, "_loop", None) null-guard. There is no equivalent guard for _async_thread.

If restarting a deserialized module is intentionally unsupported, a clear guard in start() would make that explicit:
```
if self._async_thread is None:
    raise RuntimeError(
        f"{type(self).__name__} was deserialized and cannot be restarted; "
        "reconstruct the module instead."
    )
```
If restart after deserialization should be supported, _async_thread needs to be reconstructed in __setstate__ similarly to how _disposables is.

Reviews (2): Last reviewed commit: "CI code cleanup"

greptile-apps · 2026-03-25T07:07:28Z

dimos/utils/test_thread_utils.py

+                assert done.wait(timeout=10), "Deadlock with slow ModuleThread.stop()"
+
+
+from dimos.utils.typing_utils import ExceptionGroup


ExceptionGroup imported at bottom of file, used earlier

ExceptionGroup is imported on line 888 but first used on line 750 inside TestSafeThreadMap methods. This works at runtime because the full module is loaded before any test runs, but it's confusing to readers: the symbol appears to be undefined at its use sites, and any linter or static analysis tool will flag these as NameErrors. The import should be moved to the top-level imports block alongside the other third-party imports.

Suggested change

from dimos.utils.typing_utils import ExceptionGroup

from dimos.utils.typing_utils import ExceptionGroup

(Move this to the top of the file alongside the other dimos.utils imports, and remove line 888.)

greptile-apps · 2026-03-25T07:07:30Z

dimos/utils/test_thread_utils.py

+    """
+
+    @staticmethod
+    def _make_fake_stop(mod: FakeModule, done: threading.Event) -> Callable:


Missing Callable import used in return-type annotation

Callable is referenced as a return-type annotation in _make_fake_stop but is never imported in this file. With from __future__ import annotations in effect, the annotation is stored as a string at definition time and won't raise a NameError at runtime. However, any call to typing.get_type_hints(_make_fake_stop) — including some test introspection tools — will fail with NameError: name 'Callable' is not defined.

Add to the imports at the top of the file:

from collections.abc import Callable
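
The failure mode described here can be reproduced in a few lines. The sketch below uses an explicit string annotation, which is stored unevaluated in the same way annotations are under `from __future__ import annotations`; the function name `make_stop` is made up for the example.

```python
import typing


def make_stop() -> "Callable":  # Callable is deliberately NOT imported
    return lambda: None


# Defining and calling the function works: the annotation is never evaluated.
stop = make_stop()
stop()

# Resolving the annotation is what fails.
try:
    typing.get_type_hints(make_stop)
except NameError as exc:
    resolve_error = str(exc)
else:
    resolve_error = None
```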

dimos/agents/mcp/mcp_server.py

greptile-apps · 2026-03-25T07:07:31Z

dimos/utils/thread_utils.py

+        self._watchdog = ModuleThread(
+            module=self._module,
+            target=self._watch,
+            name=f"proc-{self._process.pid}-watchdog",
+        )


Each ModuleProcess.start() call adds a new ModuleThread disposable

Every time start() is called (line 388), a new ModuleThread is constructed for the watchdog. ModuleThread.__init__ immediately registers a Disposable(self.stop) in module._disposables (line 155). CompositeDisposable simply appends, so restarting the process accumulates stale disposables for watchdog threads that have already exited.

For the single-use lifecycle this is fine. But if start() is ever called more than once (e.g. after a failed first attempt, or the deferred-start path), the module's disposable list grows unboundedly, and on teardown each old watchdog's stop() is called even though it already finished, which — while idempotent — is surprising and hard to debug.

Consider either:

Explicitly removing the old watchdog disposable before creating a new one, or

Documenting clearly that start() is a one-shot operation and raising an error on re-entry.

super().start() will throw if it's called more than once. We can/should assume start isn't being called multiple times AFAIK.

__setstate__/__getstate__ are different though; start() could be called after __setstate__, I believe
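
The second option Greptile lists (treat start() as one-shot and raise on re-entry) could look roughly like this. It is a hypothetical illustration: `OneShotProcess` and the plain list used as a cleanup registry are invented for the example and are not ModuleProcess or CompositeDisposable.

```python
import threading
from typing import Callable, List


class OneShotProcess:
    """Illustration only: start() may run once per instance."""

    def __init__(self, disposables: List[Callable[[], None]]) -> None:
        self._disposables = disposables
        self._started = False
        self._lock = threading.Lock()

    def start(self) -> None:
        with self._lock:
            if self._started:
                raise RuntimeError("start() is one-shot; build a new instance to restart")
            self._started = True
        # Registered exactly once, so the cleanup list cannot grow on a retry.
        self._disposables.append(self.stop)

    def stop(self) -> None:
        pass  # real teardown would go here


cleanups: List[Callable[[], None]] = []
proc = OneShotProcess(cleanups)
proc.start()
try:
    proc.start()
except RuntimeError:
    second_start_rejected = True
else:
    second_start_rejected = False
```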

jeff-hykin · 2026-03-25T07:16:19Z

dimos/core/module.py

+        with self.mod_state as state:
+            if state == "stopped":
+                raise RuntimeError(f"{type(self).__name__} cannot be restarted after stop")
+            self.mod_state.set("started")


I know lots of modules don't call super().start() but they also wouldn't be using mod_state 'cause it's a new thing.

Different/off-topic discussion, but I think core2 should have ModuleBase as a class decorator instead of an inherited class (we can basically wrap methods instead of saying "please remember to call super").

That's one of the reasons I don't like inheritance. But can you explain what you mean by ModuleBase being a decorator? At first glance that seems more complicated.

jeff-hykin · 2026-03-25T07:27:17Z

dimos/core/module.py

-        loop = getattr(self, "_loop", None)
+        # dispose of things BEFORE making aspects like rpc and _tf invalid
+        if hasattr(self, "_disposables"):
+            self._disposables.dispose()  # stops _async_thread via disposable


I think it's important to move disposables up before the rpc stop and the tf stop

jeff-hykin · 2026-03-25T07:29:26Z

dimos/agents/mcp/mcp_server.py

        if self._uvicorn_server:
            self._uvicorn_server.should_exit = True
-            loop = self._loop
-            if loop is not None and self._serve_future is not None:


the loop is always there until super().stop() is called

jeff-hykin · 2026-03-25T07:32:56Z

dimos/agents/mcp/mcp_server.py

        server = uvicorn.Server(config)
        self._uvicorn_server = server
-        loop = self._loop
-        assert loop is not None


loop always there until stop is called

jeff-hykin · 2026-03-25T07:33:26Z

dimos/agents/mcp/test_mcp_server.py

+        return s.getsockname()[1]
+
+
+def test_mcp_server_lifecycle() -> None:


jeff-hykin · 2026-03-25T07:34:57Z

dimos/core/test_core.py

    assert hasattr(class_rpcs["start"], "__rpc__"), "start should have __rpc__ attribute"

-    nav._close_module()
+    nav._stop()


I'm trying to consolidate our naming to be "stop" instead of half "stop" half "close"

jeff-hykin · 2026-03-25T07:37:07Z

dimos/utils/thread_utils.py

+# ThreadSafeVal: a lock-protected value with context-manager support
+
+
+class ThreadSafeVal(Generic[T]):


this is my favorite util. I hate having _thing and _thing_lock and _thing2 and _thing2_lock, but I also hate seeing _thing being used in a method and thinking "hmm ... does _thing have a lock that's not being used?". This prevents ambiguity about what vals need locks and what vals don't
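
For readers who have not opened the diff, here is a rough sketch of the idea: one object that owns both the value and its lock. `LockedValue` is a hypothetical stand-in, not the PR's ThreadSafeVal. It uses `threading.RLock` so that calling set() while already inside the with block does not deadlock, which is the pattern start() uses with mod_state.

```python
import threading
from typing import Generic, TypeVar

T = TypeVar("T")


class LockedValue(Generic[T]):
    """Hypothetical stand-in for ThreadSafeVal; not the PR's implementation."""

    def __init__(self, value: T) -> None:
        self._value = value
        self._lock = threading.RLock()  # reentrant: the owning thread may re-acquire

    def get(self) -> T:
        with self._lock:
            return self._value

    def set(self, value: T) -> None:
        with self._lock:
            self._value = value

    def __enter__(self) -> T:
        self._lock.acquire()
        return self._value

    def __exit__(self, *exc_info: object) -> None:
        self._lock.release()


state = LockedValue("init")
with state as current:
    if current == "init":
        # With a plain threading.Lock this nested acquire would block forever.
        state.set("started")
```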

jeff-hykin · 2026-03-25T07:39:49Z

dimos/utils/thread_utils.py

+        self._thread.start()
+
+    def stop(self) -> None:
+        """Signal the thread to stop and join it.


this is probably the part that needs the most review

jeff-hykin · 2026-03-25T07:42:16Z

dimos/utils/thread_utils.py

+# safe_thread_map: parallel map that collects all results before raising
+
+
+def safe_thread_map(


Not used in this PR, but is used by the docker branch so getting it in here a bit early cause this is the util file it belongs in
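
As a rough picture of what "collects all results before raising" means, here is a simplified sketch. It is not the PR's safe_thread_map: the names `thread_map_all` and `MapErrors` are invented, there is no on_errors hook, and a plain aggregate exception stands in for ExceptionGroup so the example runs on Python versions before 3.11.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")
R = TypeVar("R")


class MapErrors(Exception):
    """Hypothetical aggregate error; the PR raises ExceptionGroup instead."""

    def __init__(self, message: str, errors: List[BaseException]) -> None:
        super().__init__(message)
        self.errors = errors


def thread_map_all(items: Sequence[T], fn: Callable[[T], R]) -> List[R]:
    """Run fn over items in threads and wait for every call before raising."""
    if not items:
        return []
    with ThreadPoolExecutor(max_workers=len(items)) as pool:
        futures = [pool.submit(fn, item) for item in items]
    # Leaving the with block waits for every future to finish.
    results: List[R] = []
    errors: List[BaseException] = []
    for future in futures:
        err = future.exception()
        if err is None:
            results.append(future.result())
        else:
            errors.append(err)
    if errors:
        raise MapErrors("thread_map_all failed", errors)
    return results  # same order as items
```

The point of the design is that one failing item does not hide the others: every call runs to completion, and the caller sees all failures at once.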

jeff-hykin · 2026-03-25T07:43:04Z

dimos/utils/typing_utils.py

+
+if sys.version_info < (3, 11):
+
+    class ExceptionGroup(Exception):  # type: ignore[no-redef]  # noqa: N818


I didn't want to repeat all this kludge so I put it here. Let me know if there's a better spot

paul-nechifor · 2026-03-26T03:03:46Z

dimos/utils/thread_utils.py

+        if self._thread.is_alive() and self._thread is not threading.current_thread():
+            self._thread.join(timeout=self._close_timeout)
+
+    def join(self, timeout: float | None = None) -> None:


I don't think you need join since you're already join()-ing in stop.

paul-nechifor · 2026-03-26T03:04:26Z

dimos/utils/thread_utils.py

+        self._stopped = False
+        self._stop_lock = threading.Lock()


Why do you need _stopped and _stop_lock? You have _stop_event.

paul-nechifor · 2026-03-26T03:24:46Z

dimos/utils/thread_utils.py

+
+    def start(self) -> None:
+        """Start the underlying thread."""
+        self._stop_event.clear()


You don't need this. It's already off. If you want ModuleThread to be restartable, then you need to use another thread since threads aren't restartable.
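
The restart limitation mentioned here is easy to confirm with the standard library: a `threading.Thread` object can be started at most once, so a restartable wrapper has to build a new Thread. A minimal demonstration:

```python
import threading

worker = threading.Thread(target=lambda: None)
worker.start()
worker.join()

try:
    worker.start()  # second start on the same Thread object
except RuntimeError as exc:
    restart_error = exc
else:
    restart_error = None

# "Restarting" therefore means wrapping the same target in a fresh Thread.
replacement = threading.Thread(target=lambda: None)
replacement.start()
replacement.join()
```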

paul-nechifor · 2026-03-26T03:27:24Z

dimos/utils/thread_utils.py

+        if start:
+            self.start()


Noooooo, don't autostart in the constructor. 😭

😈 no boilerplate

But fr, how do you feel about ModuleThread().start()

paul-nechifor · 2026-03-26T03:29:22Z

dimos/utils/thread_utils.py

+                self._worker = ModuleThread(
+                    module=self,
+                    target=self._run_loop,
+                    name="my-worker",


It would be nice if ModuleThread used self.module.__class__.__name__ as the prefix so we can just leave name blank most of the time and it still produces a useful name for debugging.
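
One possible shape for that default, as a sketch only: derive the prefix from the owner's class and fall back to the target's name when no name is given. `make_named_thread` and `Camera` are hypothetical names for illustration, not part of the PR.

```python
import threading
from typing import Callable, Optional


def make_named_thread(
    owner: object, target: Callable[[], None], name: Optional[str] = None
) -> threading.Thread:
    """Build a daemon thread named '<OwnerClass>-<name or target name>'."""
    prefix = type(owner).__name__
    suffix = name or getattr(target, "__name__", "thread")
    return threading.Thread(target=target, name=f"{prefix}-{suffix}", daemon=True)


class Camera:
    def run(self) -> None:
        pass


cam = Camera()
default_name = make_named_thread(cam, cam.run).name
explicit_name = make_named_thread(cam, cam.run, "poller").name
```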

paul-nechifor · 2026-03-26T03:29:45Z

dimos/utils/thread_utils.py

+        return f"ThreadSafeVal({self._value!r})"
+
+
+# ModuleThread: a thread that auto-registers with a module's disposables


Why add this if there's a docstring below?

cause AI loves redundancy
(I'll remove it, thanks for bringing attention)

paul-nechifor · 2026-03-26T04:09:13Z

dimos/core/module.py

-    def _close_module(self) -> None:
-        with self._module_closed_lock:
-            if self._module_closed:
+    def _stop(self) -> None:


_close_module is a remnant from when the Module class hierarchy was more complicated. Some classes were skipping Module.__init__ and didn't initialize self._disposables for example. That's why I'm using hasattr(self, "_disposables") or hasattr(self, "_tf"). We didn't even have stop then.

I think it's not needed at all anymore. This could be deleted if you want and moved into def stop.

happily! I thought it was an rpc vs non-rpc thing

jeff-hykin · 2026-03-26T04:48:46Z

dimos/utils/thread_utils.py

+                self._worker = ModuleThread(
+                    module=self,
+                    target=self._run_loop,
+                    name="my-worker",


Suggested change

-                    name="my-worker",
+                    name=self.module.__class__.__name__+"_my_worker",

- Add mod.stop() to test_process_crash_triggers_stop so watchdog, LCM, and event-loop threads are properly joined from the test thread
- Filter third-party daemon threads with generic names (Thread-\d+) in conftest monitor_threads to ignore torch/HF background threads that have no cleanup API

Convert test_process_crash_triggers_stop to use a fixture that calls mod.stop() in teardown. The watchdog thread calls self.stop() but can't join itself, so an explicit stop() from the test thread is needed to properly clean up all threads. Drop the broad conftest regex filter for generic daemon thread names per review feedback.

mod.stop() is a no-op when the watchdog already called it, so capture thread IDs before the test and join new ones in teardown.


Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>

- Merge _stop() into stop() in ModuleBase (removes unnecessary indirection)
- Update all callers of _stop() to use stop() directly
- Add thread_start() convenience function that creates + starts a ModuleThread

AsyncModuleThread no longer spawns the event loop thread in __init__. The loop is created on the first call to start(), which ModuleBase.start() now calls. This means module construction no longer has side effects — no threads are spawned until the module is explicitly started.

greptile-apps · 2026-03-27T03:18:44Z

dimos/utils/thread_utils.py

+
+    @property
+    def loop(self) -> asyncio.AbstractEventLoop:


loop property typed as non-optional but can return None

self._loop is initialised to None in __init__ and is only set to a real AbstractEventLoop inside start(). The property is typed as asyncio.AbstractEventLoop (non-optional), but any caller that accesses .loop before start() has been called will receive None and get a runtime AttributeError rather than a clear error about the missing initialisation.

Consider raising explicitly:

Suggested change

-    @property
-    def loop(self) -> asyncio.AbstractEventLoop:
+    @property
+    def loop(self) -> asyncio.AbstractEventLoop:
+        """The managed event loop."""
+        if self._loop is None:
+            raise RuntimeError(
+                f"{self._module_name} async thread has not been started; call start() first"
+            )
+        return self._loop

This would have immediately surfaced the test_mcp_server_lifecycle issue above instead of propagating a confusing AttributeError from deep inside asyncio.

paul-nechifor · 2026-03-27T19:02:22Z

dimos/agents/mcp/test_mcp_server.py

    assert response["error"]["code"] == -32601
+
+
+def _free_port() -> int:


This already exists as _find_free_port in the codebase. I've actually converted it to a fixture (find_free_port) in my PR but unlikely to be merged soon.

paul-nechifor · 2026-03-27T19:05:21Z

dimos/agents/mcp/test_mcp_server.py

+    for _ in range(40):
+        try:
+            resp = requests.post(
+                url,
+                json={"jsonrpc": "2.0", "method": "initialize", "id": 1},
+                timeout=0.5,
+            )
+            if resp.status_code == 200:
+                break
+        except requests.ConnectionError:
+            time.sleep(0.1)


Stash already has 3 instances of this exact sequence (as functions called wait_for_mcp). Please don't add a 4th that is inlined.
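
A shared helper for this kind of polling might look roughly like the sketch below. It is a generic illustration, not the existing wait_for_mcp (whose signature is not shown in this thread): `wait_until` and its parameters are invented here. It catches OSError, which covers the built-in ConnectionError family, and uses a deadline so the total wait is bounded.

```python
import time
from typing import Callable


def wait_until(check: Callable[[], bool], timeout: float = 4.0, interval: float = 0.1) -> bool:
    """Poll check() until it returns True or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            if check():
                return True
        except OSError:
            pass  # e.g. connection refused: the server is not listening yet
        time.sleep(interval)
    return False


# Hypothetical readiness check that fails twice before succeeding.
attempts = {"count": 0}


def ready() -> bool:
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionRefusedError("not listening yet")
    return True


became_ready = wait_until(ready, timeout=2.0, interval=0.01)
```

A test would then pass a small closure that performs the HTTP request and returns whether the status code is 200.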

paul-nechifor · 2026-03-27T19:06:47Z

dimos/agents/mcp/test_mcp_server.py

+    assert data["result"]["serverInfo"]["name"] == "dimensional"
+
+    # Stop and verify it shuts down
+    server.stop()


The server should be stopped even if the test fails. The best way to do it is to use a fixture for the mcp server.

paul-nechifor · 2026-03-27T19:09:09Z

dimos/core/module.py

-        loop_thread = getattr(self, "_loop_thread", None)
-        loop = getattr(self, "_loop", None)
+        # dispose of things BEFORE making aspects like rpc and _tf invalid
+        if hasattr(self, "_disposables"):


You'll have lots of conflicts here with what Ivan is doing for disposables in his PR: #1682.

paul-nechifor · 2026-03-27T19:10:00Z

dimos/core/module.py

+            self.rpc.stop()  # type: ignore[attr-defined]
+            self.rpc = None  # type: ignore[assignment]


Please avoid type: ignore

paul-nechifor · 2026-03-27T19:14:15Z

dimos/core/module.py

    _bound_rpc_calls: dict[str, RpcCall] = {}
-    _module_closed: bool = False
-    _module_closed_lock: threading.Lock
+    mod_state: ThreadSafeVal[ModState]


I don't like mod for module. It's not much of an abbreviation (it makes more sense for modification).

paul-nechifor · 2026-03-27T19:20:23Z

dimos/utils/thread_utils.py

@@ -0,0 +1,559 @@
+# Copyright 2025-2026 Dimensional Inc.


I think these are separate enough that they could be their own files. It's often said that "utils" is a dumping ground for any odd code, but this could be put into a dimos/core/threading/ directory.

paul-nechifor · 2026-03-27T19:22:34Z

dimos/utils/thread_utils.py

+        if self._owns_loop and self._loop is not None and self._loop.is_running():
+            self._loop.call_soon_threadsafe(self._loop.stop)
+
+        if self._thread is not None and self._thread.is_alive():


It's good to also check that this call isn't from the current thread.
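
The reason for that check: `Thread.join()` raises RuntimeError when a thread tries to join itself, which can happen if stop() is triggered from inside the worker. A minimal sketch of the guard, using a made-up `Worker` class in place of AsyncModuleThread:

```python
import threading
from typing import Optional


class Worker:
    """Illustration only: a thread that may call its own stop()."""

    def __init__(self) -> None:
        self._stop_event = threading.Event()
        self.self_stop_error: Optional[RuntimeError] = None
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self) -> None:
        self._thread.start()

    def _run(self) -> None:
        # The worker decides to shut itself down, e.g. after a fatal error.
        try:
            self.stop()
        except RuntimeError as exc:  # would be raised by a self-join
            self.self_stop_error = exc

    def stop(self, timeout: float = 2.0) -> None:
        self._stop_event.set()
        # Only join when called from a different thread.
        if self._thread is not threading.current_thread():
            self._thread.join(timeout=timeout)


w = Worker()
w.start()
w.stop()  # called from the main thread, so this one does join
```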

paul-nechifor · 2026-03-27T19:23:31Z

dimos/utils/thread_utils.py

+        close_timeout: float = 2.0,
+    ) -> None:
+        self._close_timeout = close_timeout
+        self._stopped = ThreadSafeVal(False)


This is what Event is for.

paul-nechifor · 2026-03-27T19:28:14Z

dimos/utils/thread_utils.py

+        self._loop: asyncio.AbstractEventLoop | None = None
+        self._module_name = type(module).__name__
+
+        module._disposables.add(Disposable(self.stop))


ModuleBase has an AsyncModuleThread. It's rather odd for AsyncModuleThread to tell its owner what to do.

AsyncModuleThread should not take ModuleBase in its __init__. ModuleBase should register the shutdown of AsyncModuleThread

paul-nechifor · 2026-03-27T19:42:08Z

dimos/utils/thread_utils.py

+    @property
+    def loop(self) -> asyncio.AbstractEventLoop:
+        """The managed event loop."""
+        return self._loop


I'm surprised mypy doesn't complain that you're returning None when only asyncio.AbstractEventLoop is allowed. Greptile is right about the comment above.

paul-nechifor · 2026-03-27T19:43:20Z

dimos/utils/thread_utils.py

+        self._process: subprocess.Popen[bytes] | None = None
+        self._watchdog: ModuleThread | None = None
+        self._module = module
+        self._stopped = ThreadSafeVal(False)


paul-nechifor · 2026-03-27T19:45:19Z

dimos/utils/thread_utils.py

+        self.last_stdout: collections.deque[str] = collections.deque(maxlen=log_tail_lines)
+        self.last_stderr: collections.deque[str] = collections.deque(maxlen=log_tail_lines)
+
+        module._disposables.add(Disposable(self.stop))


Module should add ModuleProcess.stop to its disposables. ModuleProcess should not help itself to Module's private data like _disposables.
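
The ownership direction suggested here can be sketched as follows: the child knows nothing about its owner, and the owner registers the child's stop() in its own cleanup list. `Disposables`, `Child`, and `Owner` are toy classes invented for the example; they are not the reactivex CompositeDisposable, ModuleProcess, or ModuleBase.

```python
from typing import Callable, List


class Disposables:
    """Tiny stand-in for a composite disposable."""

    def __init__(self) -> None:
        self._callbacks: List[Callable[[], None]] = []

    def add(self, callback: Callable[[], None]) -> None:
        self._callbacks.append(callback)

    def dispose(self) -> None:
        # Run cleanups in reverse registration order, each exactly once.
        while self._callbacks:
            self._callbacks.pop()()


class Child:
    """Holds no reference to its owner."""

    def __init__(self) -> None:
        self.running = False

    def start(self) -> None:
        self.running = True

    def stop(self) -> None:
        self.running = False


class Owner:
    def __init__(self) -> None:
        self._disposables = Disposables()
        self._child = Child()
        # The owner wires up cleanup; the child never touches _disposables.
        self._disposables.add(self._child.stop)

    def start(self) -> None:
        self._child.start()

    def stop(self) -> None:
        self._disposables.dispose()


owner = Owner()
owner.start()
was_running = owner._child.running
owner.stop()
```

The trade-off is one extra line in the owner in exchange for a child that can be constructed and tested without a module.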

paul-nechifor · 2026-03-27T22:34:56Z

dimos/utils/thread_utils.py

+            return on_errors(zipped, successes, errors)  # type: ignore[return-value, no-any-return]
+        raise ExceptionGroup("safe_thread_map failed", errors)
+
+    return [outcomes[i] for i in range(len(items))]  # type: ignore[misc]


Please don't ignore.

jeff-hykin marked this pull request as draft March 25, 2026 07:03

greptile-apps bot reviewed Mar 25, 2026

jeff-hykin commented Mar 25, 2026

paul-nechifor reviewed Mar 26, 2026

jeff-hykin commented Mar 26, 2026

SUMMERxYANG and others added 9 commits March 26, 2026 20:10

CI code cleanup

6413d99

CI code cleanup

b11eb6e

chore: retrigger CI

820be9a

fix(test): join threads directly in crash_module fixture

38a31fc

mod.stop() is a no-op when the watchdog already called it, so capture thread IDs before the test and join new ones in teardown.

CI code cleanup

055b7f3

fix(native_module): preserve watchdog reference so second stop() can …

8fb0526

…join it

minimal fix

060e049

jeff-hykin and others added 7 commits March 26, 2026 20:11

misc improve

a4aba11

cleanup

5773170

Apply suggestions from code review

f62a0b0

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>

-

72f38c4

fix order of _disposables

9836102

pr feedback

a1586f3

refactor: merge _stop into stop, add thread_start helper

218b72e

- Merge _stop() into stop() in ModuleBase (removes unnecessary indirection)
- Update all callers of _stop() to use stop() directly
- Add thread_start() convenience function that creates + starts a ModuleThread

jeff-hykin marked this pull request as ready for review March 27, 2026 03:12

jeff-hykin enabled auto-merge (squash) March 27, 2026 03:12

jeff-hykin disabled auto-merge March 27, 2026 03:13

jeff-hykin enabled auto-merge (squash) March 27, 2026 03:13

jeff-hykin force-pushed the jeff/fix/native_threading branch from 1b4450c to 4240573 Compare March 27, 2026 03:14

CI code cleanup

bac8488

greptile-apps bot reviewed Mar 27, 2026

jeff-hykin changed the title ~~Jeff/fix/native threading~~ Threading Utils (and fix for Native Module flakey test) Mar 27, 2026

paul-nechifor reviewed Mar 27, 2026
