Skip to content

estuary-cdk: implement ConcurrentTask for concurrent webhook writes#4372

Open
nicolaslazo wants to merge 1 commit into
mainfrom
nlazo/cdk-webhook-independent-tasks
Open

estuary-cdk: implement ConcurrentTask for concurrent webhook writes#4372
nicolaslazo wants to merge 1 commit into
mainfrom
nlazo/cdk-webhook-independent-tasks

Conversation

@nicolaslazo
Copy link
Copy Markdown
Contributor

Description:

The current Task implementation gives callers manual control over how many documents are buffered before each checkpoint. That model works when there is a single producer per Task, but our aiohttp webhook server handles an arbitrary number of concurrent requests across the same Task; this exposes us to the risk that concurrent checkpoint() calls may interleave on the shared buffer.

This change introduces a ConcurrentTask class that wraps Task with an atomic emit() method that atomically writes buffers and checkpoints together.

ConcurrentTask also gives each binding its own buffer, replacing the previous stopgap where a single mutex was shared across every binding in the webhook server.

Closes #4322.

Workflow steps:

(How does one use this feature, and how has it changed)

Documentation links affected:

(list any documentation links that you created, or existing ones that you've identified as needing updates, along with a brief description)

Notes for reviewers:

(anything that might help someone review this PR)

@nicolaslazo nicolaslazo requested a review from Alex-Bair May 6, 2026 18:50
@nicolaslazo nicolaslazo self-assigned this May 6, 2026
@nicolaslazo nicolaslazo force-pushed the nlazo/cdk-webhook-independent-tasks branch from 709683a to 4af1232 Compare May 8, 2026 19:29
@nicolaslazo nicolaslazo force-pushed the nlazo/cdk-webhook-transactor branch 5 times, most recently from 0ad5aee to 4ae0f7b Compare May 12, 2026 19:57
Base automatically changed from nlazo/cdk-webhook-transactor to main May 13, 2026 11:44
The current Task implementation gives callers manual control over how
many documents are buffered before each checkpoint. That model works
when there is a single producer per Task, but our aiohttp webhook server
handles an arbitrary number of concurrent requests across the same Task;
this exposes us to the risk that concurrent checkpoint() calls may
interleave on the shared buffer.

This change introduces a ConcurrentTask subclass with an atomic `emit()`
method that atomically writes buffers and checkpoints together.

ConcurrentTask also gives each binding its own buffer, replacing the
previous stopgap where a single mutex was shared across every binding in
the webhook server.
@nicolaslazo nicolaslazo force-pushed the nlazo/cdk-webhook-independent-tasks branch from 4af1232 to 2c56f61 Compare May 13, 2026 11:58
Copy link
Copy Markdown
Member

@Alex-Bair Alex-Bair left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know I originally provided the direction of instantiating separate Tasks for each webhook binding, but the more I think about the problem we're trying to solve plus seeing that the existing source-appsflyer captures aren't throughput limited by sharing a single Task, I'd rather we not take the "one Task per webhook binding" approach right now; it's an optimization I thought we'd need, but based on empirical evidence, we don't need to take on that additional complexity & improve throughput here yet. I apologize for changing my mind & switching direction late on this one. 🙇

The messy bit remaining to clean up is ensuring a Task doesn't attempt to write to its buffer while its buffer is being drained. We're using locks to coordinate this whenever we're using the various Task methods, but IMO it'd be cleaner to push this complexity down into the buffer itself; if the buffer guaranteed no write would ever happen when it's being drained, that'd simplify a lot of the logic elsewhere in the CDK now & in the future.

So, what I'm proposing is we:

  • still use a single Task for all webhook bindings
  • wrap the Task's buffer in a ConcurrentBuffer class whose writes and drains are safe for concurrent producers. A ConcurrentBuffer would be an overall improvement for all connectors, albeit it's an improvement that wasn't needed until we started capturing webhooks.

I've pushed a spike implementation of this up to the bair/estuary-cdk-concurrent-buffer-spike branch for reference of what I'm thinking. Do you think that's a suitable alternative or see something I'm missing?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants