Single process horizontal scaling limitations #79
Just to be clear, our 300k messages/sec is only under heavy load, not a permanent baseline. Users send marketing broadcasts to thousands of recipients, and each broadcast needs multiple broker messages: get contact information, run enrichment, ask the AI to create a message fit for the user, send it, save it to our DB, receive the webhook and mark it as received, etc. Horizontal scaling allows us to easily spin up new broker processes under heavy load and then scale back down when those heavy broadcasts are finished.
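For readers skimming the thread, here is a hypothetical sketch of how one broadcast fans out into many broker messages. The queue names and payload shapes are invented for illustration, not Konvo's actual pipeline:

```typescript
// Hypothetical illustration of the fan-out described above: one marketing
// broadcast to N contacts becomes several broker messages per contact.
// Queue names and payload shapes are invented for this sketch.
type Job = { queue: string; payload: Record<string, unknown> };

function jobsForBroadcast(broadcastId: string, contactIds: string[]): Job[] {
  const jobs: Job[] = [];
  for (const contactId of contactIds) {
    jobs.push(
      { queue: "fetch-contact",   payload: { broadcastId, contactId } },
      { queue: "enrich-contact",  payload: { broadcastId, contactId } },
      { queue: "ai-personalize",  payload: { broadcastId, contactId } },
      { queue: "send-message",    payload: { broadcastId, contactId } },
      { queue: "persist-message", payload: { broadcastId, contactId } },
      // A further message arrives later via the delivery webhook to mark receipt.
    );
  }
  return jobs;
}

// 50k contacts x ~5 jobs each is already ~250k broker messages for a single
// broadcast, which is how bursts in the 300k msg/sec range arise.
console.log(jobsForBroadcast("bc_1", ["c_1", "c_2"]).length); // 10
```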
Hi @gerardatkonvo — thanks for the detailed framing. 300k msg/sec on BullMQ Pro is a serious workload, and the question deserves a concrete technical answer rather than a hand-wave. Let me address both points directly.

On the architectural constraint and the LiteFS / rqlite / libSQL angle

The single-process broker is not a permanent constraint, but I want to be upfront that none of LiteFS, rqlite, or Turso/libSQL would actually solve the scaling problem you're describing. They address durability and failover, not horizontal write throughput, and the reason matters for the broader roadmap discussion.

bunqueue's hot path is dominated by in-memory data structures (sharded priority queues, the jobIndex map, the lock table, the scheduler heap) running inside a single Bun event loop. SQLite is a downstream sink: writes are batched every 10ms by the WriteBuffer and flushed in bulk. The benchmark numbers (~287k ops/sec embedded, ~149k TCP for bulk push) are CPU-bound on the broker process, not I/O-bound on SQLite. Replacing SQLite with a replicated backend would not move the ceiling; it would only add latency. So those backends are a viable option for HA (zero-downtime upgrades, primary failover with low RPO) but not for horizontal scaling.

The actual roadmap for distributed brokers

True multi-broker support is on the roadmap, and it is planned as a bunqueue Pro offering. The architecture I'm targeting is queue-name sharding via consistent hashing across a cluster of independent broker processes, similar in spirit to Redis Cluster or partitioned BullMQ deployments. Each broker remains autonomous, with its own local SQLite, no consensus protocol on the hot path, and no coordination overhead per operation. A client-side router (or a thin proxy layer) maps queue names to brokers via consistent hashing with virtual nodes, and the worker layer connects directly to the broker that owns the queue it's consuming.
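To make the routing idea concrete, here is a minimal sketch of a client-side consistent-hash router with virtual nodes. The class and method names are illustrative, not bunqueue's actual API:

```typescript
import { createHash } from "node:crypto";

// Hypothetical sketch of the client-side router described above: queue names
// map to brokers via consistent hashing with virtual nodes. Every producer
// and worker computes the same owner for a given queue name, with no
// coordination on the hot path.
class ConsistentHashRouter {
  private ring: { point: number; broker: string }[] = [];

  constructor(brokers: string[], vnodes = 64) {
    for (const broker of brokers) {
      for (let i = 0; i < vnodes; i++) {
        // Each broker contributes `vnodes` points to smooth the distribution.
        this.ring.push({ point: this.hash(`${broker}#${i}`), broker });
      }
    }
    this.ring.sort((a, b) => a.point - b.point);
  }

  private hash(key: string): number {
    // First 4 bytes of SHA-1 as an unsigned 32-bit ring position.
    return createHash("sha1").update(key).digest().readUInt32BE(0);
  }

  // The first virtual node clockwise from the queue's hash owns the queue.
  brokerFor(queueName: string): string {
    const h = this.hash(queueName);
    const node = this.ring.find((n) => n.point >= h) ?? this.ring[0];
    return node.broker;
  }
}

const router = new ConsistentHashRouter(["broker-a:6379", "broker-b:6379", "broker-c:6379"]);
console.log(router.brokerFor("email-broadcast"));
```

The virtual nodes are what keep the rebalancing cost low: adding or removing one broker only moves the keys on its ring segments, not the whole keyspace.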
This preserves the throughput characteristics of a single broker per node while letting aggregate capacity scale linearly with cluster size: four brokers reach ~600k ops/sec, eight reach ~1.2M, and so on. HA is layered on top via per-shard primary/replica pairs (LiteFS-style replication is appropriate here, since within a shard you're back to the single-writer model where it works well). The Pro tier would also include the operational tooling that becomes mandatory in a clustered setup: rebalancing, drain protocols, cross-broker stats aggregation, and a coordinator for cron schedulers to prevent duplicate fires.

The honest trade-off you should know about: queue-name sharding does not split a single hot queue across nodes. If your 300k msg/sec lives on one or two logical queues, partitioning won't help; you would need to introduce sub-queue keying at the producer.

I cannot commit to a public timeline for the Pro release yet, but I can confirm it is the direction, and I'd be happy to keep you updated as the design firms up.

On the practical ceiling of the single broker

Concrete numbers from the published benchmarks, so you can do the math against your workload:
These are sustained at 50k jobs in the harness; the broker scales smoothly up to that range without degradation. The realistic ceiling for production traffic is the end-to-end column: ~75k jobs/sec embedded, ~33k jobs/sec TCP.

Other limits to be aware of:
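To make the capacity math concrete, here is a back-of-envelope sketch using the end-to-end figures quoted above (benchmark harness numbers, not guarantees):

```typescript
// Back-of-envelope capacity math using the benchmark figures quoted in this
// thread. Treat the ceilings as harness results, not SLAs.
const peakLoad = 300_000;        // Konvo's peak messages/sec
const embeddedCeiling = 75_000;  // end-to-end jobs/sec, embedded mode
const tcpCeiling = 33_000;       // end-to-end jobs/sec, TCP mode

const brokersNeeded = (ceiling: number) => Math.ceil(peakLoad / ceiling);

console.log(brokersNeeded(embeddedCeiling)); // 4 brokers at the embedded ceiling
console.log(brokersNeeded(tcpCeiling));      // 10 brokers at the TCP ceiling
```

In other words, even the planned sharded cluster only covers this workload if the load spreads across enough distinct queue names to fill four to ten shards.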
For your specific case

Being direct: at 300k msg/sec, single-broker bunqueue is not currently a fit, and I would not want you to migrate into a wall. The realistic options I see for Konvo are:
Happy to go deeper on any of these or to schedule a short call if it helps. And if Konvo's workload profile would shape the Pro design decisions in a useful direction, I'd genuinely value the input. — Egeo
Thanks @gerardatkonvo — that's a perfectly reasonable bar, and honestly the right call for the workload profile you described. I'd give Konvo the same advice in your position. The kind words on architecture and API design genuinely mean a lot, especially coming from a team running at that scale. I'll absolutely ping you here (or on LinkedIn — happy to connect) the moment the Pro cluster work is in a state worth re-evaluating. And if at any point Konvo would be open to sharing more about your queue topology and job-mix profile, that kind of real-world input from a workload at your size would meaningfully shape the design priorities — no commitment expected, just an open door if it's useful on your end. Wishing you a smooth path forward with the broadcasts, and thanks again for taking the time to make the case so clearly. — Egeo
rqlite creator here. That is correct: rqlite is not about horizontal write scaling. Good to see accurate descriptions of what it does; many folks have an unclear idea of what Raft-based systems actually do.
Yes, I would be happy to connect. If you'd like to talk 1:1, maybe book an rqlite Office Hours slot? https://calendly.com/philipomailbox-calendly/rqlite-office-hours
Hi. Cool project! We're currently using BullMQ Pro at Konvo and exploring alternatives with a clear migration path.
I noticed on the comparison page that horizontal scaling is listed as single-process. I understand bunqueue supports multiple worker clients over TCP, so the worker layer scales — but the broker itself is a single process bound to one SQLite instance.
Our current BullMQ deployment peaks at around 300k messages/sec, which is roughly 10x the benchmarked throughput of a single bunqueue broker. So for us, the distributed scaling question isn't theoretical — it's a hard requirement.
Two questions:
Would love to understand the long-term scaling story before committing to a migration.