Lock-free fast path for buffer Pin/Unpin under concurrent readers #61

Open
krleonid wants to merge 3 commits into ci/merge-v1.5-variegata-into-stable from lock-free-pin-unpin-variegata



krleonid commented on May 30, 2026

Summary

  • Lock-free Pin/Unpin fast path: When a block is already loaded and has active readers (readers > 0), atomically increment/decrement readers via compare-and-swap without acquiring the per-block mutex. Eliminates mutex contention for the common case of hot blocks with multiple concurrent readers.
  • shared_mutex for BlockManager block registry: Replace exclusive mutex on the block registry (blocks_lock) with std::shared_mutex. TryGetBlock and BlockIsRegistered now acquire a shared (read) lock, allowing concurrent scans to look up block handles in parallel. RegisterBlock uses a fast shared-lock check before falling back to exclusive lock for insertion.
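
The registry change in the second bullet can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in this PR: `BlockRegistry`, the simplified `BlockHandle`, and the `weak_ptr` map are hypothetical stand-ins, and only the names `TryGetBlock`, `BlockIsRegistered`, `RegisterBlock`, `UnregisterBlock`, and `blocks_lock` come from the description above.

```cpp
#include <cstdint>
#include <memory>
#include <mutex>
#include <shared_mutex>
#include <unordered_map>

// Hypothetical stand-in for the real block handle type.
struct BlockHandle {
    explicit BlockHandle(int64_t id) : block_id(id) {}
    int64_t block_id;
};

// Hypothetical registry; the real BlockManager holds more state.
class BlockRegistry {
public:
    // Read path: any number of scans may hold the shared lock at once.
    std::shared_ptr<BlockHandle> TryGetBlock(int64_t block_id) const {
        std::shared_lock<std::shared_mutex> guard(blocks_lock);
        auto it = blocks.find(block_id);
        if (it == blocks.end()) {
            return nullptr;
        }
        return it->second.lock(); // null if the handle has already expired
    }

    bool BlockIsRegistered(int64_t block_id) const {
        return TryGetBlock(block_id) != nullptr;
    }

    std::shared_ptr<BlockHandle> RegisterBlock(int64_t block_id) {
        // Fast check under the shared lock: most calls find an existing handle.
        if (auto existing = TryGetBlock(block_id)) {
            return existing;
        }
        // Slow path: take the exclusive lock and re-check, because another
        // thread may have inserted the block between the two lock scopes.
        std::unique_lock<std::shared_mutex> guard(blocks_lock);
        auto &slot = blocks[block_id];
        if (auto existing = slot.lock()) {
            return existing;
        }
        auto handle = std::make_shared<BlockHandle>(block_id);
        slot = handle;
        return handle;
    }

    void UnregisterBlock(int64_t block_id) {
        std::unique_lock<std::shared_mutex> guard(blocks_lock);
        blocks.erase(block_id);
    }

private:
    mutable std::shared_mutex blocks_lock;
    std::unordered_map<int64_t, std::weak_ptr<BlockHandle>> blocks;
};
```

The re-check after acquiring the exclusive lock is what keeps the shared-lock fast check safe: the fast check is only an optimization, and the insertion decision is always made under the exclusive lock.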

Benchmark results

Lock-free Pin/Unpin (single table, SELECT * WHERE itemId IN <1000 ids>, 95K rows, 9284 segments)

| Connections | Before | After | Improvement |
|---|---|---|---|
| Single | ~95ms | ~92ms | no regression |
| 10 concurrent | 562ms | 193ms | 66% faster |
| 20 concurrent | 1177ms | 497ms | 58% faster |

shared_mutex on blocks_lock (multi-DB, 194-column table, SAMPLE 1000 rows)

| Setup | Before | After | Improvement |
|---|---|---|---|
| 20 DBs, 20 conns | 59.3 qps / 334ms | 66.1 qps / 300ms | 10% faster |
| 40 DBs, 40 conns | 56.8 qps / 699ms | 60.7 qps / 655ms | 6% faster |
| 10 DBs, 16 conns | 61.7 qps / 259ms | 68.5 qps / 233ms | 10% faster |

Test plan

  • Verify single-connection queries show no regression
  • Verify concurrent wide-table scans show reduced latency
  • Verify no crashes under concurrent Pin/Unpin stress
  • Verify RegisterBlock/UnregisterBlock still work correctly under concurrent load

🤖 Generated with Claude Code

When a block is already loaded and has active readers (readers > 0),
atomically increment/decrement readers via compare-and-swap without
acquiring the per-block mutex. This eliminates mutex contention for
the common case of hot blocks with multiple concurrent readers.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
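
For readers unfamiliar with the pattern, the fast path described in the commit message above might look something like the sketch below. It is an assumption-laden illustration, not the PR's code: the `BlockHandle` layout, the helper names `TryPinFast` and `TryUnpinFast`, and the choice to take the fast unpin path only while `readers > 1` (so the last reader always goes through the mutex) are all hypothetical.

```cpp
#include <atomic>
#include <cstdint>
#include <mutex>

// Hypothetical, heavily simplified block handle.
struct BlockHandle {
    std::atomic<int32_t> readers{0};
    std::mutex lock;     // per-block mutex, used only on the slow path
    bool loaded = false; // guarded by `lock`
};

// Fast path: succeeds only if the block already has at least one reader,
// which implies it is loaded and cannot be evicted underneath us.
inline bool TryPinFast(BlockHandle &h) {
    int32_t current = h.readers.load(std::memory_order_acquire);
    while (current > 0) {
        if (h.readers.compare_exchange_weak(current, current + 1,
                                            std::memory_order_acq_rel,
                                            std::memory_order_acquire)) {
            return true;
        }
        // On failure `current` holds the freshly observed value; retry.
    }
    return false;
}

inline void Pin(BlockHandle &h) {
    if (TryPinFast(h)) {
        return;
    }
    // Slow path: no active readers, so loading must be serialized.
    std::lock_guard<std::mutex> guard(h.lock);
    if (!h.loaded) {
        h.loaded = true; // placeholder for reading the block into memory
    }
    h.readers.fetch_add(1, std::memory_order_acq_rel);
}

// Fast path for unpin (assumption): only taken when we are not the last reader.
inline bool TryUnpinFast(BlockHandle &h) {
    int32_t current = h.readers.load(std::memory_order_acquire);
    while (current > 1) {
        if (h.readers.compare_exchange_weak(current, current - 1,
                                            std::memory_order_acq_rel,
                                            std::memory_order_acquire)) {
            return true;
        }
    }
    return false;
}

inline void Unpin(BlockHandle &h) {
    if (TryUnpinFast(h)) {
        return;
    }
    std::lock_guard<std::mutex> guard(h.lock);
    if (h.readers.fetch_sub(1, std::memory_order_acq_rel) == 1) {
        // Last reader gone: a real buffer manager would make the block
        // eligible for eviction here, still under the per-block mutex.
    }
}
```

Because the counter can only leave zero under the mutex in this sketch, code that holds the mutex and observes `readers == 0` knows no fast-path pin can succeed concurrently.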
krleonid force-pushed the lock-free-pin-unpin-variegata branch 2 times, most recently from cdcd392 to 5617e3b on May 30, 2026 23:51
…ntention

Replace the exclusive mutex on the block registry with a platform-specific
read-write lock. TryGetBlock and BlockIsRegistered acquire a read lock,
allowing concurrent scans to look up block handles without serializing.
RegisterBlock uses a fast read-lock check before falling back to a write
lock for insertion. UnregisterBlock takes a write lock.

Uses pthread_rwlock_t on POSIX and SRWLOCK on Windows. Platform headers
are isolated in the .cpp file to avoid polluting the header namespace.
RAII guards (ReadLockGuard/WriteLockGuard) ensure exception-safe unlock.

Benchmark (20 DBs, 20 connections, 194-column table, 1000 rows/query):
  Before: 56.8 qps, 698.8ms avg latency
  After:  60.7 qps, 654.6ms avg latency (+6.9%)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
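
A rough sketch of the RAII guards described in the commit message above, assuming a small wrapper class (here called `RWLock`, a hypothetical name) around `pthread_rwlock_t` or `SRWLOCK`. For brevity the platform headers are included directly, whereas the commit keeps them in the .cpp file; only `ReadLockGuard` and `WriteLockGuard` are names taken from the commit message.

```cpp
#ifdef _WIN32
#include <windows.h>
#else
#include <pthread.h>
#endif

// Hypothetical wrapper over the platform read-write lock.
class RWLock {
public:
    RWLock() {
#ifdef _WIN32
        InitializeSRWLock(&lock);
#else
        pthread_rwlock_init(&lock, nullptr);
#endif
    }
    ~RWLock() {
#ifndef _WIN32
        pthread_rwlock_destroy(&lock); // SRWLOCK needs no explicit teardown
#endif
    }
    RWLock(const RWLock &) = delete;
    RWLock &operator=(const RWLock &) = delete;

#ifdef _WIN32
    void LockShared() { AcquireSRWLockShared(&lock); }
    void UnlockShared() { ReleaseSRWLockShared(&lock); }
    void LockExclusive() { AcquireSRWLockExclusive(&lock); }
    void UnlockExclusive() { ReleaseSRWLockExclusive(&lock); }
#else
    void LockShared() { pthread_rwlock_rdlock(&lock); }
    void UnlockShared() { pthread_rwlock_unlock(&lock); }
    void LockExclusive() { pthread_rwlock_wrlock(&lock); }
    void UnlockExclusive() { pthread_rwlock_unlock(&lock); }
#endif

private:
#ifdef _WIN32
    SRWLOCK lock;
#else
    pthread_rwlock_t lock;
#endif
};

// Holds the lock in shared mode for the lifetime of the guard.
class ReadLockGuard {
public:
    explicit ReadLockGuard(RWLock &l) : lock(l) { lock.LockShared(); }
    ~ReadLockGuard() { lock.UnlockShared(); }
    ReadLockGuard(const ReadLockGuard &) = delete;
    ReadLockGuard &operator=(const ReadLockGuard &) = delete;

private:
    RWLock &lock;
};

// Holds the lock in exclusive mode for the lifetime of the guard.
class WriteLockGuard {
public:
    explicit WriteLockGuard(RWLock &l) : lock(l) { lock.LockExclusive(); }
    ~WriteLockGuard() { lock.UnlockExclusive(); }
    WriteLockGuard(const WriteLockGuard &) = delete;
    WriteLockGuard &operator=(const WriteLockGuard &) = delete;

private:
    RWLock &lock;
};
```

Since the unlock happens in the destructor, the lock is released on every exit path from the guarded scope, including an exception thrown while it is held.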
krleonid force-pushed the lock-free-pin-unpin-variegata branch from 5617e3b to a17746f on May 31, 2026 06:33
Log when Purge takes >1000ms or >10 iterations, reporting queue size,
dead nodes before/after, and elapsed time. Helps diagnose buffer pool
performance degradation in production.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
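
The logging condition in the commit message above could be expressed along these lines. Everything here is illustrative: `PurgeStats`, `ShouldLogPurge`, `FormatPurgeLog`, and the message format are hypothetical, and only the two thresholds (more than 1000ms, more than 10 iterations) and the reported fields come from the commit message.

```cpp
#include <chrono>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical record of one Purge call.
struct PurgeStats {
    uint64_t queue_size;
    uint64_t dead_nodes_before;
    uint64_t dead_nodes_after;
    uint64_t iterations;
    std::chrono::milliseconds elapsed;
};

// Thresholds taken from the commit message.
constexpr std::chrono::milliseconds kSlowPurgeThreshold{1000};
constexpr uint64_t kMaxQuietIterations = 10;

// Log only when a purge is unusually slow or needed many iterations.
inline bool ShouldLogPurge(const PurgeStats &s) {
    return s.elapsed > kSlowPurgeThreshold || s.iterations > kMaxQuietIterations;
}

// Hypothetical message format; the real log line is not shown in this PR text.
inline std::string FormatPurgeLog(const PurgeStats &s) {
    char buf[256];
    std::snprintf(buf, sizeof(buf),
                  "Purge: queue_size=%llu dead_nodes=%llu->%llu iterations=%llu elapsed_ms=%lld",
                  static_cast<unsigned long long>(s.queue_size),
                  static_cast<unsigned long long>(s.dead_nodes_before),
                  static_cast<unsigned long long>(s.dead_nodes_after),
                  static_cast<unsigned long long>(s.iterations),
                  static_cast<long long>(s.elapsed.count()));
    return std::string(buf);
}
```

A caller would time the purge with `std::chrono::steady_clock`, fill in a `PurgeStats`, and emit `FormatPurgeLog(stats)` only when `ShouldLogPurge(stats)` returns true, so the common fast case adds no log volume.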
krleonid force-pushed the lock-free-pin-unpin-variegata branch from 5015e1f to 4ef7a1c on May 31, 2026 07:54