
Eliminate CGO overhead in row scanning hot path#1

Merged
krleonid merged 2 commits into v2.6.0-16.preview from eliminate-cgo-scan-overhead on May 26, 2026

Conversation

@krleonid
Owner

Summary

  • Replace per-cell CGO calls in getNull and getBytes with pure Go pointer arithmetic, reducing ~500K CGO boundary crossings per 1000-row × 250-column query to zero
  • Bypass verifyAndRewriteColIdx in rows.Next by calling getFn directly (no bounds/projection check needed on the read path)

Measured improvement

Single query, 1000 rows × 250 columns, no concurrency:

| Metric | Before | After |
| --- | --- | --- |
| Avg scan time | 16.4ms | 8.1ms |
| Per-row scan | 16.4µs | 8.1µs |

Under 90% CPU pressure (10 concurrent workers):

| Metric | Before | After |
| --- | --- | --- |
| Avg scan time | 432ms | 355ms |
| Max scan time | 1.19s | 922ms |
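As a quick sanity check on the figures above, the sketch below derives the speedup ratios and the crossings-per-cell count from the numbers reported in this PR. The helper name `speedup` is hypothetical, and no new measurements are introduced.

```go
package main

import "fmt"

// speedup returns how many times faster the "after" figure is than "before".
func speedup(before, after float64) float64 {
	return before / after
}

func main() {
	// Single query, 1000 rows x 250 columns (values in ms).
	fmt.Printf("single-query avg scan: %.2fx\n", speedup(16.4, 8.1)) // 2.02x

	// Under 90% CPU pressure, 10 concurrent workers (values in ms).
	fmt.Printf("pressure avg scan:     %.2fx\n", speedup(432, 355))  // 1.22x
	fmt.Printf("pressure max scan:     %.2fx\n", speedup(1190, 922)) // 1.29x

	// ~500K CGO crossings over 1000 x 250 cells is about 2 crossings per cell.
	cells := 1000 * 250
	fmt.Printf("crossings per cell:    %d\n", 500_000/cells) // 2
}
```

The gain is roughly 2× for an isolated query and closer to 1.2× to 1.3× under CPU pressure.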

Changes

  • vector_getters.go: getNull — inline bit manipulation instead of mapping.ValidityMaskValueIsValid CGO call
  • vector_getters.go: getBytes — read duckdb_string_t layout directly instead of mapping.StringTData (2 CGO calls + C.GoBytes copy)
  • rows.go: Next() — call getFn directly, skipping GetValue/verifyAndRewriteColIdx

Test plan

  • go test ./... passes (full test suite)
  • Run concurrent_select_storm benchmark under CPU pressure
  • Run duckdb-fetch-bench from mdb-engine to compare per-row Next/Scan times

🤖 Generated with Claude Code

krleonid and others added 2 commits May 26, 2026 22:39
Replace per-cell CGO calls in getNull and getBytes with pure Go pointer
arithmetic, reducing ~500K CGO boundary crossings per 1000-row × 250-column
query to zero. Bypass verifyAndRewriteColIdx in rows.Next by calling getFn
directly.

Measured improvement (single query, 1000 rows × 250 cols):
- Scan time: 16.4ms → 8.1ms (2× faster)
- Under 90% CPU pressure: max scan 1.2s → 0.9s

Changes:
- vector_getters.go: getNull uses inline bit manipulation instead of
  mapping.ValidityMaskValueIsValid CGO call
- vector_getters.go: getBytes reads duckdb_string_t layout directly
  instead of mapping.StringTData (2 CGO calls + C.GoBytes)
- rows.go: Next() calls getFn directly, skipping GetValue/verifyAndRewriteColIdx

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@krleonid krleonid merged commit 2040d70 into v2.6.0-16.preview May 26, 2026
22 of 24 checks passed