perf(mcp): sync vault I/O blocks event loop, causes cross-tool timeouts #66

@scops

Summary

Several MCP tools (engrama_remember, engrama_relate, engrama_sync_note, engrama_sync_vault, engrama_write_insight_to_vault, engrama_status) make synchronous filesystem calls into the ObsidianAdapter from inside async tool handlers. Because the MCP server runs on a single-threaded asyncio event loop, each slow vault I/O call blocks every other tool call until it returns.

In practice this manifests as engrama_relate and engrama_surface_insights hitting the 4-minute client timeout while engrama_status keeps replying instantly — the server looks half-alive, but is actually serialised behind a stuck vault write.

Symptoms

  • Most visible when the vault is hosted on a cloud-sync drive (Proton Drive, OneDrive, Dropbox…), where the sync agent may briefly lock or stall files immediately after a write.
  • Calls that were fast in v0.10.x began timing out after v0.11.0, which added automatic vault note creation inside engrama_relate and therefore expanded the surface area of sync I/O.
  • During a stall, parallel tools queue behind the blocked one and time out together.

Root cause

engrama/adapters/mcp/server.py makes ~15 direct calls to sync vault methods from inside async def handlers:

  • obsidian.list_notes(...) (status, sync_vault)
  • obsidian.read_note(...) (remember, relate, sync_note, sync_vault, write_insight_to_vault)
  • obsidian.add_relation(...) (remember inline-relations, relate)
  • target.write_text(...), a direct pathlib.Path write rather than an adapter method (remember, relate, sync_note, sync_vault, write_insight_to_vault)

While any one of these is waiting on disk I/O, the event loop is frozen. The server stops processing other tool requests until the call returns.

Quantitative validation

Local repro with a synthetic 1.5 s sleep injected into add_relation:

  • Before fix: engrama_status invoked in parallel waits the full 1.5 s.
  • After fix (wrapping in asyncio.to_thread): engrama_status responds in ~6 ms.
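A minimal sketch of how this repro could be reproduced outside the server. SlowVault, relate_blocking, relate_offloaded and status_latency are hypothetical stand-ins, not code from server.py; only the add_relation name and the 1.5 s synthetic sleep come from the repro above.

```python
import asyncio
import time


class SlowVault:
    """Hypothetical stand-in for ObsidianAdapter with a synthetic stall."""

    def __init__(self, delay: float) -> None:
        self.delay = delay

    def add_relation(self, source: str, target: str) -> None:
        time.sleep(self.delay)  # simulates a stalled vault write


async def relate_blocking(vault: SlowVault) -> None:
    # The sync call runs on the event loop thread and freezes it.
    vault.add_relation("a", "b")


async def relate_offloaded(vault: SlowVault) -> None:
    # Same call, same arguments, executed in a worker thread.
    await asyncio.to_thread(vault.add_relation, "a", "b")


async def status_latency(relate, vault: SlowVault) -> float:
    """Seconds a trivial 'status' coroutine waits when started with relate."""
    started = time.perf_counter()

    async def status() -> float:
        return time.perf_counter() - started

    _, latency = await asyncio.gather(relate(vault), status())
    return latency


def measure(delay: float = 1.5) -> tuple[float, float]:
    vault = SlowVault(delay)
    before = asyncio.run(status_latency(relate_blocking, vault))
    after = asyncio.run(status_latency(relate_offloaded, vault))
    return before, after


if __name__ == "__main__":
    before, after = measure()
    print(f"blocking: {before:.3f}s, offloaded: {after:.3f}s")
```

With the blocking variant the status coroutine should only get to run once the sleep has finished; with the offloaded variant it should run almost immediately. Exact timings depend on the machine.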

Fix outline

Wrap each of the ~15 sync vault calls in asyncio.to_thread(...). This punts the blocking I/O to a worker thread and frees the event loop to service other tools concurrently. No API or behaviour changes — same calls, same arguments, just off the main loop.
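The wrapping pattern might look like the sketch below. VaultStub and the sync_note handler body are hypothetical and simplified (the real handlers in server.py are not shown in this issue); the point is only the call-site change for an adapter method and for a direct Path.write_text.

```python
import asyncio
import tempfile
from pathlib import Path


class VaultStub:
    """Hypothetical sync adapter; only the method name echoes the issue."""

    def __init__(self, root: Path) -> None:
        self.root = root

    def read_note(self, name: str) -> str:
        path = self.root / f"{name}.md"
        return path.read_text(encoding="utf-8") if path.exists() else ""


async def sync_note(vault: VaultStub, name: str, extra: str) -> str:
    # Before: body = vault.read_note(name)
    body = await asyncio.to_thread(vault.read_note, name)

    target = vault.root / f"{name}.md"
    updated = body + extra
    # Before: target.write_text(updated, encoding="utf-8")
    # to_thread forwards positional and keyword arguments to the callable.
    await asyncio.to_thread(target.write_text, updated, encoding="utf-8")
    return updated


def demo() -> str:
    with tempfile.TemporaryDirectory() as tmp:
        vault = VaultStub(Path(tmp))
        asyncio.run(sync_note(vault, "note", "first\n"))
        asyncio.run(sync_note(vault, "note", "second\n"))
        return (Path(tmp) / "note.md").read_text(encoding="utf-8")
```

asyncio.to_thread (Python 3.9+) runs the callable in the loop's default thread pool executor, so no pool setup is needed at the call sites.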

Acceptance criteria

  • Every sync ObsidianAdapter call and direct pathlib.Path.write_text inside MCP tool handlers runs through asyncio.to_thread.
  • A concurrency smoke check confirms the event loop stays responsive under slow vault I/O (manual repro is sufficient; the harness fixtures don't easily simulate slow drives).
  • All existing tests still pass.
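One possible shape for the concurrency smoke check, sketched under assumptions: a heartbeat task measures the longest gap between event-loop ticks while a handler runs. max_loop_gap, slow_write and both handlers are hypothetical names, and the synthetic sleep stands in for a slow drive.

```python
import asyncio
import time


async def max_loop_gap(work, tick: float = 0.01) -> float:
    """Run `work()` and return the longest pause between heartbeat ticks."""
    worst = 0.0
    done = asyncio.Event()

    async def heartbeat() -> None:
        nonlocal worst
        last = time.perf_counter()
        while not done.is_set():
            await asyncio.sleep(tick)
            now = time.perf_counter()
            worst = max(worst, now - last)
            last = now

    beat = asyncio.create_task(heartbeat())
    await asyncio.sleep(0)  # let the heartbeat start before the work
    try:
        await work()
    finally:
        done.set()
        await beat
    return worst


def slow_write(delay: float) -> None:
    time.sleep(delay)  # synthetic slow vault write


async def blocking_handler(delay: float) -> None:
    slow_write(delay)  # blocks the loop


async def offloaded_handler(delay: float) -> None:
    await asyncio.to_thread(slow_write, delay)


def check(delay: float = 0.3) -> tuple[float, float]:
    blocked = asyncio.run(max_loop_gap(lambda: blocking_handler(delay)))
    free = asyncio.run(max_loop_gap(lambda: offloaded_handler(delay)))
    return blocked, free
```

A gap close to the injected delay suggests the handler still blocks the loop; a gap close to the tick interval suggests the loop stayed responsive.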

Out of scope

  • engrama_reflect itself may also deserve thread offload if its Cypher queries are long-running; left for a follow-up.
  • The MCP MeTaTa Insight ordering / scoring issues are unrelated.

Discovered in

v0.11.0, after the addition of automatic vault note creation inside engrama_relate.

Labels: bug (Something isn't working)
