samod-core: fix DontAnnounce policy drops documents synced by clients#85

Closed
shikokuchuo wants to merge 1 commit into alexjg:main from shikokuchuo:dev

Conversation

@shikokuchuo
Contributor

Fixes #84.

We've cherry-picked the commit from our repro to target main.

If you have any questions just let us know, thanks!

Root cause

In samod-core/src/actors/document/doc_state.rs, handle_load():

  1. Client syncs a new document to the server.
  2. The server spawns a document actor in Loading phase and queues the client's sync message in pending_sync_messages.
  3. Two async tasks are dispatched: storage load + announce policy check.
  4. Storage load returns empty (document is new, nothing on disk).
  5. Announce policy resolves to DontAnnounce (the server's policy is |_, _| false).
  6. handle_load checks two conditions and both hold: doc.get_heads().is_empty() is true, and eligible_conns (connections with non-DontAnnounce policy) is empty.
  7. Bug: transitions to NotFound, dropping all pending_sync_messages — the client's document data is lost.

The pending sync messages contain the actual document data from the client, but they are never processed.
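
To make steps 6 and 7 concrete, here is a minimal sketch of the pre-fix decision. It is a simplified model, not the code in doc_state.rs: `Phase`, `DocState`, `handle_load_buggy`, the `u64` change ids, and the `u32` connection ids are all hypothetical stand-ins chosen for illustration.

```rust
// Hypothetical, simplified model of the load decision. These types are
// illustrative stand-ins, not samod-core's real definitions.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Phase {
    Loading,
    Ready,
    Requesting,
    NotFound,
}

pub struct DocState {
    pub phase: Phase,
    /// Stand-in for `doc.get_heads()`: empty means nothing was loaded.
    pub heads: Vec<u64>,
    /// Sync messages queued while the actor was still loading.
    pub pending_sync_messages: Vec<Vec<u64>>,
}

impl DocState {
    /// A freshly spawned actor: still loading, with some queued messages.
    pub fn new_loading(pending: Vec<Vec<u64>>) -> Self {
        DocState {
            phase: Phase::Loading,
            heads: Vec::new(),
            pending_sync_messages: pending,
        }
    }

    /// Pre-fix behaviour as described in steps 6 and 7 above.
    pub fn handle_load_buggy(&mut self, eligible_conns: &[u32]) {
        if self.heads.is_empty() && eligible_conns.is_empty() {
            // Bug: the queued messages are discarded without being applied,
            // even though they may carry the document's data.
            self.pending_sync_messages.clear();
            self.phase = Phase::NotFound;
        } else if self.heads.is_empty() {
            self.phase = Phase::Requesting;
        } else {
            self.phase = Phase::Ready;
        }
    }
}
```

In this model, calling `handle_load_buggy(&[])` on a state built with `DocState::new_loading(vec![vec![1, 2, 3]])` ends in `Phase::NotFound` with the queue emptied and `heads` still empty, which mirrors the data loss reported in the issue.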

Fix

  • Before transitioning to NotFound, check whether there are pending sync messages. If there are, process them first — they may contain the document data. Only transition to NotFound when there are no pending messages AND no eligible connections.
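
The ordering described in the fix can be sketched as follows. This is an illustration under stated assumptions rather than the actual patch: `Phase`, `DocState`, and `handle_load_fixed` are hypothetical names, and "applying" a sync message is modelled as simply appending its change ids to `heads`.

```rust
// Hypothetical, simplified model; not samod-core's real types.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Phase {
    Loading,
    Ready,
    Requesting,
    NotFound,
}

pub struct DocState {
    pub phase: Phase,
    /// Stand-in for `doc.get_heads()`: empty means no document data yet.
    pub heads: Vec<u64>,
    /// Sync messages queued while the actor was still loading.
    pub pending_sync_messages: Vec<Vec<u64>>,
}

impl DocState {
    /// Drain the queue first, then decide the next phase.
    pub fn handle_load_fixed(&mut self, eligible_conns: &[u32]) {
        // Apply every queued message before looking at the heads, so data
        // sent by the client is not lost. `take` leaves the queue empty.
        for msg in std::mem::take(&mut self.pending_sync_messages) {
            self.heads.extend(msg);
        }

        self.phase = if !self.heads.is_empty() {
            Phase::Ready
        } else if !eligible_conns.is_empty() {
            Phase::Requesting
        } else {
            // Reached only when nothing was loaded, nothing was queued that
            // carried data, and there is nobody to request from.
            Phase::NotFound
        };
    }
}
```

With a queued message carrying data and no eligible connections, this version ends in `Phase::Ready`; it still falls through to `Phase::NotFound` when the queue is empty as well.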

cc. @cscheid

@cscheid
Contributor

cscheid commented Mar 12, 2026

To give a bit more context, Charlie works on Quarto Hub, the collaborative editor for Quarto docs that we have started working on. We found this while using samod as a sync server. It was triggered specifically when creating a large number of Automerge documents in quick succession (this happens when we create a new Quarto project, which involves creating more than one Automerge document at once).

alexjg added a commit that referenced this pull request Mar 13, 2026
Problem: when we receive a sync message for a document which we don't
have in storage and for whom the AnnouncePolicy returns DontAnnounce
then we erroneously decide that the document is unavailable even if the
incoming sync message contains data about the document. This is because
we fail to process pending sync messages once the load has completed if
we don't have the document available or any connected peers who we could
request from.

Solution: process pending sync messages before deciding that the
document is unavailable.

Whilst I'm here I also cleaned up the logic around the phase transition
during load to make it more consistent with the rest of the phase
transitions and easier to read.

Fixes: #85

Co-authored-by: Carlos Scheidegger <285675+cscheid@users.noreply.github.com>
Co-authored-by: shikokuchuo <53399081+shikokuchuo@users.noreply.github.com>
alexjg added a commit that referenced this pull request Mar 13, 2026
Problem: when we receive a sync message for a document which we don't
have in storage and for whom the AnnouncePolicy returns DontAnnounce
then we erroneously decide that the document is unavailable even if the
incoming sync message contains data about the document. This is because
we fail to process pending sync messages once the load has completed if
we don't have the document available or any connected peers who we could
request from.

Solution: process pending sync messages before deciding that the
document is unavailable.

Whilst I'm here I also cleaned up the logic around the phase transition
during load to make it more consistent with the rest of the phase
transitions and easier to read. One important improvement is that if the
document state changes multiple times in one turn of a document actor,
we only notify of the last status. This is important because it means
that if a document transitions from loading through not found and into
requesting in the same turn (which can happen if a load completes after
receiving a sync message) then we don't notify the hub of the not found
state. This is in turn important because notifying the hub of a not
found state causes any outstanding find commands to complete with
`None`.

Fixes: #85

Co-authored-by: Carlos Scheidegger <285675+cscheid@users.noreply.github.com>
Co-authored-by: shikokuchuo <53399081+shikokuchuo@users.noreply.github.com>
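
The "only notify of the last status" behaviour described in the commit message above can be illustrated with a small sketch. This is not taken from the commit itself: `Status`, `TurnNotifier`, `record`, and `finish_turn` are hypothetical names, and the hub is modelled only as whoever receives the value returned at the end of a turn.

```rust
// Hypothetical illustration of coalescing status changes within one turn.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Status {
    Loading,
    NotFound,
    Requesting,
    Ready,
}

/// Remembers only the most recent status change seen during a turn.
#[derive(Default)]
pub struct TurnNotifier {
    latest: Option<Status>,
}

impl TurnNotifier {
    /// Called on every phase transition; overwrites any earlier value.
    pub fn record(&mut self, status: Status) {
        self.latest = Some(status);
    }

    /// Called once at the end of the turn. Returns the single status that
    /// should be reported, or `None` if nothing changed this turn.
    pub fn finish_turn(&mut self) -> Option<Status> {
        self.latest.take()
    }
}
```

If a turn records `Status::NotFound` and then `Status::Requesting`, `finish_turn` yields only `Status::Requesting`, so the intermediate not-found state is never reported and outstanding find commands are not completed with `None` prematurely.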
@alexjg alexjg closed this in #86 Mar 13, 2026
@alexjg
Owner

alexjg commented Mar 13, 2026

Thanks for the PR! I fixed this in a slightly different way but I've included you both as co-authors on the commit.

@shikokuchuo shikokuchuo deleted the dev branch March 13, 2026 16:31
shikokuchuo added a commit to shikokuchuo/samod that referenced this pull request Mar 13, 2026

Development

Successfully merging this pull request may close these issues.

DontAnnounce policy drops documents synced by clients

3 participants