Skip to content

perf(bindings): zero-copy VectorAssignByteElement; legacy []byte copy unchanged#84

Merged
taniabogatsch merged 4 commits into
duckdb:mainfrom
adubovikov:perf/upstream-pr/2-hot-bind-paths
May 28, 2026
Merged

perf(bindings): zero-copy VectorAssignByteElement; legacy []byte copy unchanged#84
taniabogatsch merged 4 commits into
duckdb:mainfrom
adubovikov:perf/upstream-pr/2-hot-bind-paths

Conversation

@adubovikov
Copy link
Copy Markdown
Contributor

@adubovikov adubovikov commented May 15, 2026

Summary

High-frequency bind/vector paths for duckdb-go Appender workloads:

  • Unchanged: VectorAssignStringElementLen([]byte) still copies via C.CBytes (safe when the Go slice is reused).
  • New opt-in: VectorAssignByteElement — zero-copy UTF-8 assign (duckdb_vector_assign_string_element_len + slice pointer).
  • New opt-in: UnsafeVectorAssignStringElementLen — zero-copy, no UTF-8 validation (true borrow; was incorrectly copying via C.CBytes before).
  • BindVarchar / BindBlob / CreateBlob: avoid C.CString / C.CBytes where the C API accepts length + pointer.
  • UTF8Bytes / UnsafeUTF8Bytes marker types + VectorAssignUTF8Bytes helper.
  • Tests: TestVectorAssignByteElement_zeroGoAllocPerAssign.

Part 2/4 of perf series; depends on #83.

Companion: duckdb/duckdb-go#138 (AppendBytes / AppendBytesUnsafe in Appender).

Test plan

  • go test ./...
  • bindings_alloc_bench_test.go benchmarks

Use duckdb_bind_varchar_length, direct slice pointers for blobs and
UTF-8 checks, duckdb_create_varchar_length, and vector assign length
APIs. Add micro-benchmarks and TestCreateVarcharEmpty.
@adubovikov adubovikov force-pushed the perf/upstream-pr/2-hot-bind-paths branch from f272498 to c92882c Compare May 18, 2026 08:39
@adubovikov
Copy link
Copy Markdown
Contributor Author

Rebased onto current main (includes merged #83). Branch is now one commit on top of upstream/main: c92882c (same hot-path/bindings.go changes as before; bindings_alloc_bench_test.go merged #83 enum-only section with PR2 tests/benches). Force-pushed perf/upstream-pr/2-hot-bind-paths.

@adubovikov adubovikov marked this pull request as ready for review May 18, 2026 08:41
@adubovikov
Copy link
Copy Markdown
Contributor Author

@mlafeldt please review :-)

Document VectorAssignUTF8Bytes wrapper and prove VectorAssignStringElementLen
does not allocate in Go on the hot path (no C.CBytes).
VectorAssignStringElementLen restores C.CBytes (safe for reused slices).
New VectorAssignByteElement and fixed UnsafeVectorAssignStringElementLen
provide zero-copy []byte assign for opt-in callers (duckdb.AppendBytes).
@adubovikov adubovikov changed the title perf: avoid CString/CBytes on varchar/blob/vector hot paths (2/4) perf(bindings): zero-copy VectorAssignByteElement; legacy []byte copy unchanged May 18, 2026
adubovikov added a commit to adubovikov/duckdb-go that referenced this pull request May 18, 2026
Plain []byte in AppendRow keeps VectorAssignStringElementLen (C.CBytes copy).
AppendBytes / AppendBytesUnsafe opt into VectorAssignByteElement when available
in duckdb-go-bindings (requires duckdb/duckdb-go-bindings#84).

json.RawMessage in setJSON maps to AppendBytesUnsafe.
Comment thread bindings.go Outdated
Comment on lines +1349 to +1350
length := C.int(StringTLength(*strT))
// duckdb_string_t_length takes duckdb_string_t by value; duckdb_string_t_data takes a pointer.
length := C.int(C.duckdb_string_t_length(*strT))
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we changing this? Is the function call into StringTLength confirmed to cause overhead? I would've expected the compiler to catch this - I'll revert this in my commit.

Comment thread vector_bytes.go Outdated
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I am not mistaken, then this file only adds helpers that are not strictly speaking bindings for the existing API. I'll remove them.

@taniabogatsch taniabogatsch merged commit 55ce64c into duckdb:main May 28, 2026
13 checks passed
@taniabogatsch
Copy link
Copy Markdown
Member

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants