Skip to content

feat: QWP ingestion protocol#130

Open
jerrinot wants to merge 70 commits into
mainfrom
jh_experiment_new_ilp
Open

feat: QWP ingestion protocol#130
jerrinot wants to merge 70 commits into
mainfrom
jh_experiment_new_ilp

Conversation

@jerrinot

@jerrinot jerrinot commented May 25, 2026

Copy link
Copy Markdown
Contributor

WIP

Summary by CodeRabbit

  • New Features

    • QWP/UDP and QWP/WebSocket senders, FSN polling/drain APIs, Buffer.ilp/Buffer.qwp factories, structured QWP/WebSocket diagnostics, TLS roots password and retry backoff controls, and a QWP/UDP example script.
  • Documentation

    • Expanded QWP/UDP and QWP/WSS docs: default port 9007, connection options (datagram sizing, multicast TTL), auto-flush/retry behavior, and protocol guidance.
  • Tests

    • Extensive QWP/UDP and QWP/WebSocket unit and system tests plus uninitialized-buffer coverage.
  • Chores

    • Updated client subproject reference, Rust build features, and CI JDK/checkout steps.

Review Change Stack

jerrinot and others added 14 commits April 14, 2026 13:53
… columns

Cover all column types over QWP/UDP transport with server-side
validation: TimestampMicros, TimestampNanos, datetime (TIMESTAMP /
TIMESTAMP_NS), f64 arrays (contiguous + transposed), decimal into
pre-created DECIMAL(18,3) table, and VARCHAR string columns.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Auto-flush: bytes threshold triggers, row count triggers, disabled
mode only sends on explicit flush.

Buffer lifecycle: reuse after flush, independent buffers don't
interfere, flush(clear=False) retains content, standalone Buffer.qwp()
clear and reuse across multiple flushes.

Edge cases: multi-table rows in one flush, Unicode symbols/columns
(Zürich, 你好世界, 🚀), None values silently skipped, empty flush is
no-op, double close, context manager flushes on exit, ServerTimestamp
vs explicit TimestampNanos, max_name_len enforcement.

Stress: 500 rows exercising multiple datagrams and auto-flush.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Auto-flush interval: verify time-based trigger after 500ms.
Datagram splitting: 30 rows with max_datagram_size=200, single flush
forces Rust to split across multiple datagrams — all arrive.
Interleave HTTP and QWP/UDP senders in same process.
from_env construction via QDB_CLIENT_CONF env var.
Sender reuse: close and re-create, both sessions' data arrives.
Large string: 1000-char value roundtrips through QWP/UDP.
Symbols-only row (no columns).
Mixed timestamps: ServerTimestamp and explicit TimestampNanos in same batch.
DataFrame with datetime column as designated timestamp (at='ts').
new_buffer inherits init_buf_size and max_name_len from sender.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Mirrors the existing HTTP test_decimal_pyarrow but over QWP/UDP:
PyArrow decimal128(18,2) column flushed via sender.dataframe() into
a pre-created DECIMAL(18,3) table, values verified server-side.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Zero, negative zero, max precision (999999999999999.999), min non-zero
(0.001), NaN/Inf rejection, multiple decimal columns with different
scales, and PyArrow decimal128 with null values in DataFrame.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Buffer type mismatch: ILP buffer rejected by QWP sender, QWP buffer
rejected by HTTP sender — Rust layer returns clear error messages.

Network edge cases: wrong port flushes silently (UDP fire-and-forget),
unresolvable host fails at establish() with DNS error.

Data shape: wide row (50 columns + 5 symbols), row ordering across
multiple datagrams (100 rows split with max_datagram_size=200),
tiny datagram (max_datagram_size=1) rejected at flush.

Load: 2000 rows rapid-fire with pure auto-flush, accept >= 90%
arrival (UDP may drop under load).

Config: protocol_version in QWP conf string rejected.

Concurrency: two senders from different threads to same port.

Lifecycle: double establish rejected, establish after close rejected.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The designated timestamp column type is TIMESTAMP_NS (not TIMESTAMP)
on newer QuestDB when using TimestampNanos. Use col_types dict lookup
instead of full columns list comparison to avoid brittleness.

Verified: 77/77 system tests pass with QDB_REPO_PATH.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- sender.rst overview: add QWP/UDP alongside ILP/TCP and ILP/HTTP
- sender.rst tips: note QWP/UDP for fire-and-forget use cases
- conf.rst: note protocol_version is not applicable for QWP/UDP

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Covers: no delivery guarantee, no error feedback, buffer inspection
differences (bytes() empty, len() is estimate), standalone buffer
requirements (Buffer.qwp()), auto-flush byte defaults, datagram size
limit errors, and protocol_version inapplicability. Also adds QWP/UDP
row to the protocol version auto-detection table.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@coderabbitai

coderabbitai Bot commented May 25, 2026

Copy link
Copy Markdown

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Adds QWP/UDP and QWP/WebSocket support across C bindings, Cython runtime and stubs, Buffer factory refactor, Sender QWP/WebSocket runtime APIs, docs/examples, manifest, CI/build updates, and extensive unit/system tests.

Changes

QWP Protocol Core Implementation

Layer / File(s) Summary
C bindings and protocol declarations
src/questdb/line_sender.pxd
Expose QWP enums, server-rejection error, QWPWS progress/error structs, QWP buffer constructors, option setters, protocol query, and QWPWS operational APIs.
Type stubs and public API contracts
src/questdb/ingress.pyi
Add Protocol QwpUdp/QwpWs/QwpWss, QWP/WebSocket diagnostic/progress types, IngressServerRejectionError, Buffer factory declarations, and Sender config/method declarations.
Error conversion and protocol extension
src/questdb/ingress.pyx
Map C server_rejection to IngressErrorCode.ServerRejection; add IngressError.qwp_ws_error and IngressServerRejectionError; extend Protocol and TLS behavior.
Buffer factory and init guards
src/questdb/ingress.pyx (Buffer)
Introduce Buffer.ilp()/Buffer.qwp(), deprecate direct Buffer() construction, add cinit/_check_impl guards and update buffer operations and examples.
Auto-flush parsing and config parsing
src/questdb/ingress.pyx
Accept max_datagram_size for auto-flush defaults and add typed parsing for tls_roots_password, retry_max_backoff_millis, qwp_ws_progress; add conf_str_value.
Sender configuration parsing and wiring
src/questdb/ingress.pyx (Sender)
Extend Sender signatures and from_conf/from_env to accept max_datagram_size, multicast_ttl, tls_roots_password, retry_max_backoff, qwp_ws_progress, qwp_ws_error_handler; validate and wire options and register C trampoline.
Sender runtime APIs and close/drain
src/questdb/ingress.pyx (Sender runtime)
Add FSN-returning flushs, published/acked FSN queries, await_acked_fsn, drive_once, poll_qwp_ws_error, qwp_ws_errors_dropped, close_drain; update flush/close semantics and buffer selection/validation.

Documentation and Examples

Layer / File(s) Summary
Configuration reference and TLS notes
docs/conf.rst
Document qwpudp protocol and default port 9007, max_datagram_size, multicast_ttl, qwpwss TLS roots/password notes, auto-flush differences, and mark protocol_version N/A for QWP/UDP; refine HTTP retry docs.
Sender usage and protocol guidance
docs/sender.rst
Clarify ILP vs QWP trade-offs; document Buffer.ilp/Buffer.qwp and Sender.new_buffer; add QWP/UDP subsection and protocol/version table updates.
Installation and verification examples
docs/installation.rst
Update verification snippets to use Buffer.ilp() and bytes(buf) checks; adjust Pandas example text.
QWP/UDP example and manifest
examples/qwp_udp.py, examples.manifest.yaml, docs/examples.rst
Add runnable QWP/UDP example script, manifest entry, and docs literalinclude reference.

Unit and Integration Tests

Layer / File(s) Summary
QWP WebSocket API and Buffer unit tests
test/test.py
Add TestQwpWebSocketApi for enums, structured diagnostics, server rejection subtype, Sender.from_conf checks, and TestUninitializedBuffer asserting Buffer operations fail when uninitialized.
QWP/UDP system tests
test/system_test.py
Add extensive QWP/UDP and QWP/WebSocket integration tests and fixture wiring covering parsing, buffering, auto-flush, datagram splitting, types, timestamps, arrays/decimals, concurrency, failover, and many edge cases.

Maintenance

Layer / File(s) Summary
Subproject and build flags
c-questdb-client, setup.py
Bump c-questdb-client submodule commit reference and add Cargo feature insecure-skip-verify during initial Rust build.
CI pipeline updates
ci/cibuildwheel.yaml, ci/run_tests_pipeline.yaml
Adjust Windows AZP env filtering and make QuestDB master job clone/update submodule and resolve/publish JDK 25 JAVA_HOME; update test job env usage.

Sequence Diagram

sequenceDiagram
  participant ClientPy as Python Sender API
  participant IngressPy as ingress.pyx runtime
  participant LineSenderC as line_sender C lib
  participant Trampoline as C trampoline
  ClientPy->>IngressPy: Sender.flush / publish / flush_and_get_fsn
  IngressPy->>LineSenderC: call C publish/flush APIs
  LineSenderC->>Trampoline: emit QWPWS error/event view (if any)
  Trampoline->>IngressPy: invoke Python error handler callback
  IngressPy->>ClientPy: expose diagnostics via poll_qwp_ws_error / raise IngressServerRejectionError
Loading

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

Suggested reviewers

  • amunra
  • bluestreak01

🐰 New packets dance across the net,
A datagram and websocket duet,
Factories spawn buffers bright,
Structured errors tell the plight,
Rabbit nods — the client’s set!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 18.92% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The PR title 'feat: QWP ingestion protocol' directly matches the main objective of adding comprehensive QWP/UDP and QWP/WebSocket protocol support to the ingestion system, as evidenced by extensive changes across protocol enums, Sender APIs, documentation, examples, and extensive test coverage.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch jh_experiment_new_ilp

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
examples/qwp_udp.py (1)

10-14: 💤 Low value

Consider noting that 1400 is the default max_datagram_size.

The max_datagram_size=1400 parameter is set to its default value. While being explicit in examples is fine, you might consider adding a comment to note this is the default, helping users understand they can omit it or adjust it as needed.

📝 Optional comment addition
         with Sender(
                 Protocol.QwpUdp,
                 host,
                 port,
-                max_datagram_size=1400) as sender:
+                max_datagram_size=1400) as sender:  # default is 1400
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@examples/qwp_udp.py` around lines 10 - 14, The example explicitly passes
max_datagram_size=1400 which is the default; update the Sender usage (the call
to Sender with Protocol.QwpUdp) to either remove the parameter or add a short
inline comment (next to max_datagram_size=1400) indicating "1400 is the default"
so readers know it can be omitted or tuned as needed.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@examples/qwp_udp.py`:
- Around line 10-14: The example explicitly passes max_datagram_size=1400 which
is the default; update the Sender usage (the call to Sender with
Protocol.QwpUdp) to either remove the parameter or add a short inline comment
(next to max_datagram_size=1400) indicating "1400 is the default" so readers
know it can be omitted or tuned as needed.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: d1faad2b-3030-485c-bb29-bf483b70cb9c

📥 Commits

Reviewing files that changed from the base of the PR and between a523b3a and e8eeb69.

📒 Files selected for processing (12)
  • c-questdb-client
  • docs/conf.rst
  • docs/examples.rst
  • docs/installation.rst
  • docs/sender.rst
  • examples.manifest.yaml
  • examples/qwp_udp.py
  • src/questdb/ingress.pyi
  • src/questdb/ingress.pyx
  • src/questdb/line_sender.pxd
  • test/system_test.py
  • test/test.py

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
src/questdb/ingress.pyx (2)

2755-2760: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Make sender setup rollback-safe when buffer reservation fails.

reserve_buffer() can now raise from _new_buffer_for_sender(). If that happens after line_sender_build() succeeds, establish() leaves _impl and _opts live on the same Sender, so retrying establish() can leak or double-initialize native state.

Suggested fix
         if self._impl == NULL:
             raise c_err_to_py(err)

+        line_sender_opts_free(self._opts)
+        self._opts = NULL
         if self._buffer is None:
-            self._buffer = self._new_buffer_for_sender()
-
-        line_sender_opts_free(self._opts)
-        self._opts = NULL
+            try:
+                self._buffer = self._new_buffer_for_sender()
+            except Exception:
+                line_sender_close(self._impl)
+                self._impl = NULL
+                raise

2214-2218: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Reject unknown qwp_ws_progress values explicitly.

QwpWsProgress.parse() returns None for an unmatched tag, so this turns bad user input into an AttributeError on .c_value instead of a config error.

Suggested fix
         if qwp_ws_progress is not None:
-            c_qwp_ws_progress = QwpWsProgress.parse(qwp_ws_progress).c_value
+            progress = QwpWsProgress.parse(qwp_ws_progress)
+            if progress is None:
+                raise IngressError(
+                    IngressErrorCode.ConfigError,
+                    f'"qwp_ws_progress" has invalid value: {qwp_ws_progress!r}')
+            c_qwp_ws_progress = progress.c_value
             if not line_sender_opts_qwpws_progress(
                     self._opts, c_qwp_ws_progress, &err):
                 raise c_err_to_py(err)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/questdb/ingress.pyx` around lines 2214 - 2218,
QwpWsProgress.parse(qwp_ws_progress) can return None for unknown tags, which
currently leads to an AttributeError when accessing .c_value; update the logic
around QwpWsProgress.parse(...) in ingress.pyx to check the parse result for
None, and if None raise a clear config/validation error (rather than accessing
.c_value), otherwise assign c_qwp_ws_progress = parsed.c_value and call
line_sender_opts_qwpws_progress(self._opts, c_qwp_ws_progress, &err) as before;
reference QwpWsProgress.parse, c_qwp_ws_progress,
line_sender_opts_qwpws_progress, and c_err_to_py when implementing the explicit
None check and error raise.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@setup.py`:
- Line 149: The build currently unconditionally appends the Rust feature
"insecure-skip-verify" to cargo_args; change this so "insecure-skip-verify" is
only added when an explicit opt-in environment variable is set (e.g.,
RUST_INSECURE_SKIP_VERIFY or ALLOW_INSECURE_SKIP_VERIFY), otherwise append only
"confstr-ffi". Locate the cargo_args assembly in setup.py where the line adds
['--features', 'confstr-ffi,insecure-skip-verify'] and replace it with logic
that conditionally includes "insecure-skip-verify" based on os.environ; also
document the env var in the packaging README. After that, inspect the Rust
workspace (search for the feature name "insecure-skip-verify" in the Rust
submodules/crates) to confirm where the feature is defined/used and ensure it
does not alter TLS verification for default release builds when the env opt-in
is not set.

In `@src/questdb/ingress.pyx`:
- Around line 2352-2366: The code currently treats bool as an int for
retry_max_backoff, so True/False are accepted; update the type checks in the
retry_max_backoff handling (symbols: retry_max_backoff,
line_sender_opts_retry_max_backoff, cp_timedelta, _timedelta_to_millis,
c_err_to_py, _fqn) to explicitly exclude bools—for example, change the int
branch to require isinstance(retry_max_backoff, int) and not
isinstance(retry_max_backoff, bool) (or check type(...) is int), and ensure bool
values fall through to the TypeError branch so passing True/False raises a
TypeError with the same message.

---

Outside diff comments:
In `@src/questdb/ingress.pyx`:
- Around line 2214-2218: QwpWsProgress.parse(qwp_ws_progress) can return None
for unknown tags, which currently leads to an AttributeError when accessing
.c_value; update the logic around QwpWsProgress.parse(...) in ingress.pyx to
check the parse result for None, and if None raise a clear config/validation
error (rather than accessing .c_value), otherwise assign c_qwp_ws_progress =
parsed.c_value and call line_sender_opts_qwpws_progress(self._opts,
c_qwp_ws_progress, &err) as before; reference QwpWsProgress.parse,
c_qwp_ws_progress, line_sender_opts_qwpws_progress, and c_err_to_py when
implementing the explicit None check and error raise.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 52ba2082-709e-4246-808f-df47556aa5e5

📥 Commits

Reviewing files that changed from the base of the PR and between e8eeb69 and 34d2d88.

📒 Files selected for processing (9)
  • c-questdb-client
  • docs/conf.rst
  • docs/sender.rst
  • setup.py
  • src/questdb/ingress.pyi
  • src/questdb/ingress.pyx
  • src/questdb/line_sender.pxd
  • test/system_test.py
  • test/test.py
✅ Files skipped from review due to trivial changes (2)
  • c-questdb-client
  • docs/sender.rst

Comment thread setup.py Outdated
del env['CXX']
subprocess.check_call(
cargo_args + ['--features', 'confstr-ffi'],
cargo_args + ['--features', 'confstr-ffi,insecure-skip-verify'],

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

# 1) Locate cargo manifests in the Rust submodule.
fd -a '^Cargo.toml$' c-questdb-client

# 2) Show feature definitions/usages related to insecure TLS behavior.
rg -n --no-heading 'insecure-skip-verify|confstr-ffi|tls_verify|unsafe_off|feature' c-questdb-client

Repository: questdb/py-questdb-client

Length of output: 51


Gate insecure-skip-verify behind an explicit opt-in (not default-on)

At setup.py:149, release builds unconditionally add Cargo features confstr-ffi,insecure-skip-verify to cargo_args, which can weaken TLS certificate verification behavior in the Rust FFI for all published wheels/sdist. Make insecure-skip-verify opt-in via an explicit env var and omit it by default.

cargo_args + ['--features', 'confstr-ffi,insecure-skip-verify'],

Requested check (Rust feature wiring): the earlier c-questdb-client Cargo feature search returned no matches in this environment; after initializing the Rust submodule, inspect where insecure-skip-verify is defined/used and confirm it doesn’t change cert verification semantics for release builds.

🧰 Tools
🪛 Ruff (0.15.13)

[warning] 149-149: Consider iterable unpacking instead of concatenation

Replace with iterable unpacking

(RUF005)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@setup.py` at line 149, The build currently unconditionally appends the Rust
feature "insecure-skip-verify" to cargo_args; change this so
"insecure-skip-verify" is only added when an explicit opt-in environment
variable is set (e.g., RUST_INSECURE_SKIP_VERIFY or ALLOW_INSECURE_SKIP_VERIFY),
otherwise append only "confstr-ffi". Locate the cargo_args assembly in setup.py
where the line adds ['--features', 'confstr-ffi,insecure-skip-verify'] and
replace it with logic that conditionally includes "insecure-skip-verify" based
on os.environ; also document the env var in the packaging README. After that,
inspect the Rust workspace (search for the feature name "insecure-skip-verify"
in the Rust submodules/crates) to confirm where the feature is defined/used and
ensure it does not alter TLS verification for default release builds when the
env opt-in is not set.

Comment thread src/questdb/ingress.pyx

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
test/system_test.py (2)

796-814: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Document Sender.new_buffer() lifecycle preconditions in public stubs/docs

Sender.new_buffer() raises IngressError if called before Sender.establish() ("new_buffer() can't be called before establish()") or after Sender.close() ("new_buffer() can't be called: Sender is closed."), per test/system_test.py (lines 796-814). src/questdb/ingress.pyi (and docs/sender.rst) still describe new_buffer() without these required preconditions—update the public contract accordingly.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/system_test.py` around lines 796 - 814, The public type stub and docs
don't state that Sender.new_buffer() may raise IngressError when called before
Sender.establish() or after Sender.close(); update src/questdb/ingress.pyi and
docs/sender.rst to document these lifecycle preconditions: indicate that
new_buffer() requires prior Sender.establish() and will raise qi.IngressError
with message "new_buffer() can't be called before establish()" if not
established, and will raise qi.IngressError with message "new_buffer() can't be
called: Sender is closed" if the Sender has been closed; reference the
Sender.new_buffer, Sender.establish, Sender.close methods and the IngressError
exception in the public contract.

1043-1058: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Fix UDP auto-flush system tests to avoid false positives from context-manager flush

In test/system_test.py, the auto-flush tests (1043-1058, 1060-1073, 1344-1361, 1592-1607) use with self._mk_qwpudp_sender(...) as sender:. On normal with-exit the sender context manager calls close(flush=True), which flushes any buffered rows even if the specific byte/row/interval/rapid-fire trigger never fires—so these tests can pass while the trigger logic is broken.

Use a manual sender lifetime instead: create the sender without the context manager, call sender.establish(), and in finally call sender.close(flush=False) so buffered data is not flushed on exit; keep/retain the existing server-side assertions (e.g., retry_check_table(..., min_rows=10) and count checks) to validate that only the intended auto-flush path publishes rows.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/system_test.py` around lines 1043 - 1058, The auto-flush UDP tests
currently use the context manager from _mk_qwpudp_sender which calls
close(flush=True) on exit and masks failures; replace the with
self._mk_qwpudp_sender(...) as sender: usage by creating the sender manually,
call sender.establish() before sending rows, and ensure you close the sender in
a finally block with sender.close(flush=False) so the context exit doesn't flush
buffered rows; keep the existing server-side assertions (e.g.,
qdb_plain.retry_check_table(..., min_rows=10) and resp['count'] checks)
unchanged to validate that only the intended auto-flush trigger causes
publishes.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@test/system_test.py`:
- Around line 165-168: The _unused_tcp_port() function currently returns a port
after closing the socket, causing a TOCTOU race; instead bind and keep the
socket reserved and return the bound socket (and/or the port) so callers can use
the reserved endpoint and close the socket only when failover is set up; update
any call sites (including the other occurrence around lines 384-386) to accept
and later close the returned socket once the sender/receiver are using the port.

---

Outside diff comments:
In `@test/system_test.py`:
- Around line 796-814: The public type stub and docs don't state that
Sender.new_buffer() may raise IngressError when called before Sender.establish()
or after Sender.close(); update src/questdb/ingress.pyi and docs/sender.rst to
document these lifecycle preconditions: indicate that new_buffer() requires
prior Sender.establish() and will raise qi.IngressError with message
"new_buffer() can't be called before establish()" if not established, and will
raise qi.IngressError with message "new_buffer() can't be called: Sender is
closed" if the Sender has been closed; reference the Sender.new_buffer,
Sender.establish, Sender.close methods and the IngressError exception in the
public contract.
- Around line 1043-1058: The auto-flush UDP tests currently use the context
manager from _mk_qwpudp_sender which calls close(flush=True) on exit and masks
failures; replace the with self._mk_qwpudp_sender(...) as sender: usage by
creating the sender manually, call sender.establish() before sending rows, and
ensure you close the sender in a finally block with sender.close(flush=False) so
the context exit doesn't flush buffered rows; keep the existing server-side
assertions (e.g., qdb_plain.retry_check_table(..., min_rows=10) and
resp['count'] checks) unchanged to validate that only the intended auto-flush
trigger causes publishes.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 5f89923c-db2a-4425-bac4-1da000a92205

📥 Commits

Reviewing files that changed from the base of the PR and between 34d2d88 and da27059.

📒 Files selected for processing (1)
  • test/system_test.py

Comment thread test/system_test.py
Comment on lines +165 to +168
def _unused_tcp_port():
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
sock.bind(('127.0.0.1', 0))
return sock.getsockname()[1]

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Reserve the failover port instead of probing and releasing it.

This is a TOCTOU race: _unused_tcp_port() closes the socket before the sender uses the endpoint, so another process can claim that port and make the failover path nondeterministic.

Proposed fix
+import contextlib
 import socket
 ...
-    `@staticmethod`
-    def _unused_tcp_port():
-        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
-            sock.bind(('127.0.0.1', 0))
-            return sock.getsockname()[1]
+    `@staticmethod`
+    `@contextlib.contextmanager`
+    def _reserved_tcp_port():
+        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
+            sock.bind(('127.0.0.1', 0))
+            yield sock.getsockname()[1]
 ...
-        endpoints = [
-            (self.qdb_plain.host, self._unused_tcp_port()),
-            (self.qdb_plain.host, self.qdb_plain.http_server_port)]
+        with self._reserved_tcp_port() as dead_port:
+            endpoints = [
+                (self.qdb_plain.host, dead_port),
+                (self.qdb_plain.host, self.qdb_plain.http_server_port),
+            ]

Also applies to: 384-386

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/system_test.py` around lines 165 - 168, The _unused_tcp_port() function
currently returns a port after closing the socket, causing a TOCTOU race;
instead bind and keep the socket reserved and return the bound socket (and/or
the port) so callers can use the reserved endpoint and close the socket only
when failover is set up; update any call sites (including the other occurrence
around lines 384-386) to accept and later close the returned socket once the
sender/receiver are using the port.

jerrinot and others added 10 commits May 26, 2026 11:10
Introduce a new Client class wrapping the c-questdb-client questdb_db
pool. Client.dataframe() emits via column_sender_chunk_* adapters for
the v1 supported subset: fixed table name, NumPy int64/float64, NumPy
datetime64[ns/us] field and designated timestamp columns, pandas
categorical symbols, Arrow UTF-8 string fields, and large_string
columns (cast to UTF-8 before publication).

- Extract dataframe_plan_t in dataframe.pxi so row and columnar
  emitters share argument resolution and dtype planning.
- Add column_sender FFI declarations and UnsupportedDataFrameShapeError
  carrying structured per-column rejection details before publication.
- Schema-aware chunk planning with 8-row alignment when validity is
  present; mandatory column_sender_sync(ok) before return, with
  sync-and-retry for deferred flushes hitting the in-flight reserve.
- Benchmark harness (test/benchmark_pandas_columnar.py) covering
  Layer 1 plan/populate, Layer 2 client-ACK with pool-reuse check,
  and Layer 3 against a real QuestDB.
- Local ACK fixture (test/qwp_ws_ack_server.py) and a fixture-backed
  Layer 3 runner that spins up a QuestDB jar and verifies row counts.
- Deterministic seed-controlled fuzz coverage
  (test/test_client_dataframe_fuzz.py) exercising the planner and
  emitter via the ACK fixture; reproduce with QDB_CLIENT_FUZZ_SEED /
  QDB_CLIENT_FUZZ_ITER_SEED.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add value-edge and multi-chunk coverage to the seeded fuzz:

- Numeric generators emit IEEE-754 specials (NaN, ±Inf, ±0.0,
  subnormals) and int64 boundary values (INT64_MIN/MAX) with 5%
  probability each; string generator emits empty strings with 5%
  probability to exercise zero-length offset slices.
- test_multi_chunk_with_nulls: 100 003 rows of nullable categorical +
  int64 forces multi-chunk emission across the 8-row validity-bitmap
  alignment boundary.
- test_high_cardinality_symbol_i16: 200 categories forces pandas to
  allocate i16 codes, exercising the column_sender_chunk_symbol_dict_i16
  path that random fuzz rarely reaches.
- test_wide_frame_multi_chunk: 12 field columns + 64 001 rows hits the
  planner's n_cols>8 → 64 000 cap branch.
- test_sequential_client_lifecycle: 30 open/use/close cycles flush any
  close-then-reopen lifecycle leaks; verifies accepted_connections
  matches cycle count.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Update the Cython pxd and Client.dataframe() implementation to track
the c-questdb-client submodule's rename of the borrowed pool handle
from column_sender to qwpws_conn. No behaviour change at the Python
level; Client.from_conf / .dataframe / .close / .reap_idle keep their
signatures.

Submodule bump: c-questdb-client jh_conn_pool_refactor (7740b7a)
covers the C header, Rust FFI shim, and ABI doc.

See plan-conn-pool-and-writers.md for the multi-step plan this is
part of (Step 1 of 5).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add the pxd declaration for the new Arrow appender. The
arrow_c_data_interface module already provides the ArrowArray /
ArrowSchema struct definitions, so the new entry point can take the
same types pandas / pyarrow / polars already expose via _export_to_c.

No call site wires this in yet — Client.dataframe still uses the
per-type appenders. Wiring lands in Step 2c.

Submodule bump: c-questdb-client jh_conn_pool_refactor (632c647)
brings the new FFI surface.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
c-questdb-client jh_conn_pool_refactor (6c53ea7) adds LargeUtf8
(Arrow 'U') support to column_sender_chunk_append_arrow_column. No
Python-side change yet — Step 2c wires it in via the planner.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Client.dataframe()'s varchar (string) and symbol (dictionary)
columns now dispatch through column_sender_chunk_append_arrow_column
instead of the per-type column_sender_chunk_column_varchar and
symbol_dict_i* calls. Numeric / timestamp columns stay on the
per-type path — they were already direct-write to wire (see
Open Q1) and the Arrow appender adds no per-row benefit there.

What this changes:
  - The Cython side no longer manually slices Arrow offsets / bytes
    / dict-codes for chunked emission; the Arrow appender does
    that slicing in Rust, with one dispatch step per format.
  - Categorical i8/i16/i32 dispatch lives in Rust now, not Cython.
  - The v1 validator accepts both col_source_str_utf8_arrow and
    col_source_str_lrg_utf8_arrow — large_string flows through the
    Arrow appender's `U` path natively. (The legacy row path still
    needs the planner-shared large_string → utf8 cast, so it stays;
    the Rust appender's `U` support is latent capability today but
    used when the cast is removed in a future change.)

No public Python API change. 929 tests pass; 4 fixed seeds × 200
fuzz iters pass.

Submodule bump: c-questdb-client jh_conn_pool_refactor (0650c40)
adds the row_offset / row_count parameters to the Arrow appender.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Update plan-conn-pool-and-writers.md with Step 1, 2a, 2b, 2c
  completion (with submodule + parent SHAs).
- Correct the Step 1 claim that it resolves the round-3 dirty-sender
  concern — it doesn't; the rename is cosmetic at that layer. Add
  an explicit caveat block describing the open follow-up.
- Step 2 expansion: enumerate the three sub-steps and the known
  gaps surfaced during implementation (dead `lrg` validator arm,
  unreachable `U` format path, two-place 8-row alignment).
- Step 3 reframed per Q1: "no perf win, only narrower-dtype /
  stride / endian convenience" + the open widening-policy decisions
  to make before implementing.

Submodule bump: c-questdb-client jh_conn_pool_refactor (f35123d)
applies the Arrow mirror-type and naming polish.

929 Python tests + 836 Rust unit tests still green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
jerrinot and others added 30 commits May 29, 2026 11:22
`Client.dataframe` now accepts `pa.int8`, `pa.int16`, `pa.int32`,
and `pa.float32` Arrow columns and sends them as the corresponding
narrow QWP wire types (BYTE / SHORT / INT / FLOAT) instead of
silently failing at the column-path serializer. Pre-PR these
sources were classified by the shared planner but had no dispatch
arm on the column-QWP side — they failed at runtime with
"Unsupported columnar int/float source".

Architecture:

- New `col_target_t` narrow targets used *only* by column-QWP;
  not added to `_FIELD_TARGETS`, so the shared resolver still
  picks the wide targets (`col_target_column_i64` /
  `col_target_column_f64`) for row-ILP. Row-ILP serializer is
  untouched.
- `_dataframe_columnar_rewrite_to_narrow_arrow_targets` runs
  after validation as a pure post-processing pass: rewrites the
  (target, dispatch_code) pair for the four Arrow narrow sources.
- `_dataframe_columnar_finalize_plan` bundles
  validate-then-retarget so the three column-QWP entry points
  call one helper instead of orchestrating both steps.

Validator's i64/f64 source allowlists extended to accept the new
Arrow sources — they're transitionally classified to the wide
target by the shared planner, then narrowed by the retarget
step. The validator's catch-all stays clean (no per-narrow-target
pass-through branch).

Tests: 9 system tests in `TestColumnIngressNarrowTypes` covering
round-trip (i8/i16/i32/f32 against BYTE/SHORT/INT/FLOAT target
columns), server-side widening (i8 → LONG, f32 → DOUBLE),
INT32_MIN sentinel collision, SHORT non-nullable contract, and
`pa.uint8` rejection pin.

Doc: `pandas-paths-comparison.md` gains a "Closing the
type-support gap" section pinning the design — strict Arrow
mirror client-side, server-side coercion for everything else —
plus the categorised work breakdown and the four verification
items.

DATE / `pa.timestamp('ms')` deferred to a follow-up; it touches
the shared row-ILP classifier and needs row-ILP serializer
support to ship safely.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`_dataframe_resolve_target` now takes the field-targets tuple as a
parameter; each protocol passes its own ordering. Row-ILP gets
`_FIELD_TARGETS_ROW` (today's behaviour, wide targets only).
Column-QWP gets `_FIELD_TARGETS_QWP` with the narrow targets
listed in front of their wide counterparts, so the resolver picks
BYTE / SHORT / INT / FLOAT directly for Arrow narrow sources
instead of widening to LONG / DOUBLE.

This deletes the post-resolution scaffold added in the previous
PR — the rewrite step exists only because the resolver was making
the wrong target choice for column-QWP. Once the resolver gets
the path-specific tuple, the right target falls out on the first
pass, and the rewrite + the finalize bundle + the validator
allowlist extensions all become unnecessary.

Net: -38 lines, no behaviour change. Prerequisite for PR 2
(UUID) — UUID's `FixedSizeBinary(16)` source has no existing
wide target whose source-set can transitionally accept it, so the
rewrite pattern would not have extended to that case. Listing
`col_target_column_uuid` directly in `_FIELD_TARGETS_QWP` is the
natural way to add it.

- `_FIELD_TARGETS` → `_FIELD_TARGETS_ROW` + `_FIELD_TARGETS_QWP`
- Parameter plumbed through `_dataframe_resolve_target`,
  `_dataframe_resolve_cols_target_name_and_dc`,
  `_dataframe_resolve_args`, `_dataframe_plan_build`. Five
  callsites updated (2 row-ILP, 3 column-QWP).
- Deleted `_dataframe_columnar_finalize_plan` and
  `_dataframe_columnar_rewrite_to_narrow_arrow_targets`.
- Validator's i64 / f64 source allowlists reverted to their
  pre-PR-1 shape; explicit narrow-target branch added so the
  catch-all doesn't fire on the new targets.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`Client.dataframe` now accepts `pa.fixed_size_binary(16)` columns
and the `arrow.uuid` extension type wrapping the same storage,
forwarding the bytes to QuestDB's UUID wire encoding via
`column_sender_chunk_column_uuid`.

Path-asymmetry contract: column-QWP routes `fsb16_arrow` through
the new `col_target_column_uuid` (listed in `_FIELD_TARGETS_QWP`);
row-ILP's `_FIELD_TARGETS_ROW` deliberately omits it, so the
shared resolver fails to map `fsb16_arrow` to a target when
`Sender.dataframe` is called with a UUID column — pinned by the
`test_fsb16_rejected_by_row_ilp` test.

Byte order: bytes are passed through verbatim in QuestDB's UUID
wire layout ("bytes 0..8 lo half LE, bytes 8..16 hi half LE", per
`column_sender.h:303`). This matches the convention shared across
the c-questdb-client family — Polars ingress
(`questdb-rs/src/ingress/polars.rs`, downcast to
`FixedSizeBinaryArray` + `extend_from_slice`) and egress
(unchanged FSB storage emitted with the `arrow.uuid` metadata
label) do the same. Users who want `uuid.UUID.bytes` (RFC 4122
big-endian) round-trip convert at their boundary.

Source detection: extended `_dataframe_series_resolve_arrow` to
unwrap `pa.BaseExtensionType` and to recognise
`FixedSizeBinary(16)` as `col_source_fsb16_arrow`. Other
fixed-width sizes (e.g. `FSB(8)`, `FSB(32)`) are not routed by
this PR and surface as the existing "Unsupported arrow type"
rejection — LONG256 (FSB(32)) lands in PR 3.

Tests: 7 new system tests in `TestColumnIngressNarrowTypes`:
canonical FSB(16) round-trip, `arrow.uuid` extension-type
round-trip (skip if pyarrow lacks `pa.uuid()`), null-bitmap
round-trip, server-side string→UUID coercion, invalid-UUID-
string rejection, row-ILP UUID rejection, and `FSB(k != 16)`
rejection. Includes a `_uuid_to_wire` helper for tests and
docstring examples; consider exposing this publicly in a
follow-up if user friction emerges.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds two more QuestDB-extension types to `Client.dataframe`,
following the UUID pattern from PR 2:

- IPV4: `pa.uint32()` → `column_sender_chunk_column_ipv4`. The
  Arrow source overlaps with the existing
  `col_target_column_i64` (which widens any uint to LONG for
  row-ILP). `_FIELD_TARGETS_QWP` lists IPV4 before i64 so the
  column-QWP resolver picks IPV4 first; row-ILP's
  `_FIELD_TARGETS_ROW` doesn't list IPV4, so its existing widening
  behaviour is preserved. Per the strict-mirror policy:
  `pa.uint32()` is unambiguously IPV4 on column-QWP — counts that
  the user wants stored as LONG must cast to `pa.int64()` before
  ingest. Pin this with `test_pa_uint32_is_routed_to_ipv4_not_long`.
- LONG256: `pa.fixed_size_binary(32)` →
  `column_sender_chunk_column_long256`. No source-overlap; no
  ordering subtlety. Bytes forwarded verbatim in the QuestDB wire
  layout ("four LE 64-bit limbs, least-significant first"),
  matching the same opaque-bytes convention as UUID + Polars (see
  PR #150).

GEOHASH deferred: its Arrow canonical mirror is a sized int
(int8/16/32/64 depending on bit precision), which conflicts with
the narrow-int routing from PR 1. Strict-mirror dispatch has no
way to distinguish a `pa.int64()` LONG column from a `pa.int64()`
GEOHASH column without consulting Arrow field metadata — and the
agreed policy is no metadata-based dispatch. GEOHASH will need
either a different policy decision or string-coercion-only
support; revisit before declaring the type-support gap closed.

Also unlocks `pa.uint32()` on row-ILP (Arrow detection was
gated at `_dataframe_series_resolve_arrow` before this PR, even
though the row-ILP serializer at
`_dataframe_serialize_cell_column_i64__u32_arrow` already
existed). Side effect of adding Type_UINT32 detection; the
serializer's behaviour is unchanged.

Tests: 8 new system tests in `TestColumnIngressNarrowTypes` —
IPV4 round-trip, IPV4-string-coercion-currently-unsupported,
IPV4 server-rejection on invalid string, IPV4 routes-to-ipv4-not-
long pin, LONG256 round-trip, LONG256 with nulls, LONG256 row-ILP
rejection. The IPV4 server-coercion test pins a current QuestDB
limitation (no VARCHAR → IPV4 coercion at insert time); flip it
when QuestDB adds coercion support.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Route Client.dataframe Arrow publishes through one QWP buffer per call, add benchmark counters, and remove the manual preflight from real-client measurements.

Update tests for Rust Arrow timestamp validation, multi-chunk buffer reuse, and expected fuzz rejections.
Let Client.dataframe use the capsule/Rust Arrow route for pandas frames whose dtypes are already Arrow-backed, while keeping object columns and non-pyarrow string columns on the manual planner. That fixes pandas ArrowDtype UInt64, Float16, and timestamp-unit e2e ingestion without losing the Python-side bad-cell validation that must fail before borrowing a connection.

Teach capsule symbol resolution about pandas Arrow/string[pyarrow] dtypes and slice pandas frames with iloc so max_rows_per_batch still bounds each Arrow publish. Timestamp-only pandas frames deliberately fall back to the manual planner to preserve the existing UnsupportedDataFrameShapeError API and no-publication contract.

Update the Rust submodule to accept plain FSB16 as UUID and reject null timestamp fields, matching the Python and real-server contracts. The tradeoff is that FSB16 is now treated as UUID in Arrow ingestion rather than generic fixed-size binary; supporting a separate LONG128/binary policy would need an explicit future override or metadata distinction.

Also run the narrow-type e2e class against QuestDB 9.4.1+ instead of only local QDB_REPO_PATH builds, so the release-server QWP/WebSocket path is covered.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants