70 commits
7ab41a1
Add QWP/UDP system test for f64 arrays
jerrinot Apr 14, 2026
ac5573d
test: add QWP/UDP e2e tests for timestamp, array, decimal, and string…
jerrinot Apr 14, 2026
0f65b5c
c client sub
jerrinot Apr 14, 2026
92bd99b
test: add QWP/UDP auto-flush, edge-case, and stress tests
jerrinot Apr 14, 2026
1550711
test: add QWP/UDP interval flush, datagram splitting, and more e2e tests
jerrinot Apr 14, 2026
bd6ba73
test: add QWP/UDP PyArrow decimal DataFrame e2e test
jerrinot Apr 14, 2026
48f6642
test: add QWP/UDP decimal edge-case tests
jerrinot Apr 14, 2026
0f9ec3c
test: add QWP/UDP QA edge-case tests
jerrinot Apr 14, 2026
f65a186
fix: correct f64_array test timestamp type assertion
jerrinot Apr 14, 2026
a139d18
docs: mention QWP/UDP in sender overview, tips, and protocol version
jerrinot Apr 15, 2026
fbf17de
docs: add dedicated QWP/UDP section with tradeoffs and gotchas
jerrinot Apr 15, 2026
4dc8144
Expose QWP WebSocket controls in Python
jerrinot May 4, 2026
e8eeb69
Expose QWP WebSocket rejection callbacks
jerrinot May 11, 2026
f3c22d9
Upgrade QWP FFI bindings
jerrinot May 25, 2026
34d2d88
Merge remote-tracking branch 'origin/main' into jh_experiment_new_ilp
jerrinot May 25, 2026
da27059
Add QWP system test coverage
jerrinot May 25, 2026
76f107f
Fix CI compatibility with current deps and QuestDB master
jerrinot May 26, 2026
b6c8b50
fix: reject bool sender duration options
jerrinot May 26, 2026
988fae5
Pin c-questdb-client column sender branch
jerrinot May 26, 2026
735aa96
Add Client.dataframe() pooled columnar ingest path
jerrinot May 27, 2026
39c8f51
Extend Client.dataframe() fuzz coverage
jerrinot May 27, 2026
9854f5e
Step 1: bind Client to qwpws_conn FFI rename
jerrinot May 27, 2026
66ba477
Step 2a: bind column_sender_chunk_append_arrow_column in Cython
jerrinot May 27, 2026
8c04c63
Step 2b: bump submodule for LargeUtf8 support
jerrinot May 27, 2026
ff0c909
Step 2c: route Arrow-backed columns through the new appender
jerrinot May 27, 2026
a02c280
Self-review fixes: plan doc + submodule bump
jerrinot May 27, 2026
0d3b1d5
Step 4-A: support str_pyobj (object-dtype string columns)
jerrinot May 27, 2026
e43783e
Step 4-BCD: support int/float/bool PyObject column sources
jerrinot May 27, 2026
10dba21
Review fixes: pyobj null-alignment + prebuilt NULL guards
jerrinot May 27, 2026
d420d79
Step 3: route narrow NumPy dtypes through column_numpy
jerrinot May 27, 2026
64cb920
Round-3 must_close fix: drop conn on mid-call error
jerrinot May 27, 2026
860c105
Plan doc completion + NaN-as-null docs
jerrinot May 27, 2026
392e05f
Pin c-questdb-client merge of arrow_polars (PR #150)
jerrinot May 28, 2026
89a2f43
Egress: Client.query() → pandas / pyarrow
jerrinot May 28, 2026
653caea
Egress cleanup: numpy_nullable, dead code, error mapping, docs
jerrinot May 28, 2026
9db3325
Egress null contract: tests + plan-doc precision
jerrinot May 28, 2026
e0fece8
Columnar v1 ingress: accept tz-aware timestamps
jerrinot May 28, 2026
fc5a027
update FFI sub
jerrinot May 28, 2026
ca106c4
Wire `Client.query` through the `questdb_db` reader pool
jerrinot May 28, 2026
8ed4d5d
dataframe ingress/egress comparison doc
jerrinot May 28, 2026
676d82d
Column-QWP narrow Arrow primitive dispatch (i8/i16/i32/f32)
jerrinot May 29, 2026
9c1e334
Refactor: path-parameterized _FIELD_TARGETS
jerrinot May 29, 2026
5cd5cc6
Column-QWP UUID dispatch (FSB(16) + arrow.uuid extension)
jerrinot May 29, 2026
ab7d952
Column-QWP IPV4 + LONG256 dispatch (GEOHASH deferred)
jerrinot May 29, 2026
f4aa5a1
Update c-questdb-client submodule
jerrinot Jun 1, 2026
23e6de1
Align Client.dataframe UInt32 Arrow policy
jerrinot Jun 1, 2026
a277ee5
Support Arrow wide numerics in Client.dataframe
jerrinot Jun 1, 2026
04d4090
Fix Python Arrow error code enum drift
jerrinot Jun 1, 2026
be13d6b
Preserve Arrow LargeUtf8 in dataframe planner
jerrinot Jun 1, 2026
4f424a7
Prototype Rust Arrow dataframe append path
jerrinot Jun 1, 2026
ce2b92e
Add dataframe payload round-trip coverage
jerrinot Jun 1, 2026
21f7d2e
Preserve LargeUtf8 categorical dataframe symbols
jerrinot Jun 1, 2026
2a63fc7
Refresh Client.dataframe findings
jerrinot Jun 1, 2026
31fa8c8
Route Client.dataframe through Rust Arrow path
jerrinot Jun 1, 2026
0df71ea
Reuse Arrow dataframe buffers across chunks
jerrinot Jun 1, 2026
ae5c8af
Route explicit dataframe symbols through Arrow
jerrinot Jun 3, 2026
a7ae730
Pin dataframe UInt64 semantics with real e2e
jerrinot Jun 3, 2026
ed64570
Fix no-pandas CI tests
jerrinot Jun 3, 2026
4e91a5f
Test QWP against QuestDB 9.4.1
jerrinot Jun 3, 2026
350cdc0
use java 25
jerrinot Jun 3, 2026
5b997a3
Fix dataframe auto-flush flag initialization
jerrinot Jun 3, 2026
93e2e29
use new c abi for arrow and numpy column
kafka1991 Jun 5, 2026
b32aa2f
Implement reusable Arrow dataframe imports
jerrinot Jun 5, 2026
4bd3a75
Route Arrow-backed pandas dataframes through Arrow ingestion
jerrinot Jun 5, 2026
a35f967
update c-abi
kafka1991 Jun 8, 2026
bc3d6dc
remove unused code
kafka1991 Jun 8, 2026
fff5c3c
update c module and fix tests on win32
kafka1991 Jun 8, 2026
80093d5
update c module and doc string
kafka1991 Jun 8, 2026
f6a1a24
update c api and code review
kafka1991 Jun 8, 2026
6aeb1ee
update c abi
kafka1991 Jun 8, 2026
2 changes: 1 addition & 1 deletion c-questdb-client
Submodule c-questdb-client updated 230 files
4 changes: 2 additions & 2 deletions ci/cibuildwheel.yaml
@@ -107,7 +107,7 @@ stages:
cmd /c "call `"$vsPath`" && set > env_vars.txt"

Get-Content env_vars.txt | ForEach-Object {
if ($_ -match "^([^=]+?)=(.*)$" -and $matches[1] -notmatch '^(SYSTEM|AGENT|BUILD|RELEASE|VSTS|TASK|USE_|FAIL_|MSDEPLOY|AZP_75787|AZP_AGENT|AZP_ENABLE|AZURE_HTTP|COPYFILESOVERSSHV0|ENABLE_ISSUE_SOURCE_VALIDATION|MODIFY_NUMBER_OF_RETRIES_IN_ROBOCOPY|MSBUILDHELPERS_ENABLE_TELEMETRY|RETIRE_AZURERM_POWERSHELL_MODULE|ROSETTA2_WARNING|AZP_PS_ENABLE)') {
if ($_ -match "^([^=]+?)=(.*)$" -and $matches[1] -notmatch '^(SYSTEM|AGENT|BUILD|RELEASE|VSTS|TASK|USE_|FAIL_|MSDEPLOY|AZP_|AZURE_HTTP|COPYFILESOVERSSHV0|ENABLE_ISSUE_SOURCE_VALIDATION|MODIFY_NUMBER_OF_RETRIES_IN_ROBOCOPY|MSBUILDHELPERS_ENABLE_TELEMETRY|RETIRE_AZURERM_POWERSHELL_MODULE|ROSETTA2_WARNING)') {
[System.Environment]::SetEnvironmentVariable($matches[1], $matches[2], "Process")
Write-Host "##vso[task.setvariable variable=$($matches[1])]$($matches[2])"
}
@@ -137,7 +137,7 @@ stages:
cmd /c "call `"$vsPath`" && set > env_vars.txt"

Get-Content env_vars.txt | ForEach-Object {
if ($_ -match "^([^=]+?)=(.*)$" -and $matches[1] -notmatch '^(SYSTEM|AGENT|BUILD|RELEASE|VSTS|TASK|USE_|FAIL_|MSDEPLOY|AZP_75787|AZP_AGENT|AZP_ENABLE|AZURE_HTTP|COPYFILESOVERSSHV0|ENABLE_ISSUE_SOURCE_VALIDATION|MODIFY_NUMBER_OF_RETRIES_IN_ROBOCOPY|MSBUILDHELPERS_ENABLE_TELEMETRY|RETIRE_AZURERM_POWERSHELL_MODULE|ROSETTA2_WARNING|AZP_PS_ENABLE)') {
if ($_ -match "^([^=]+?)=(.*)$" -and $matches[1] -notmatch '^(SYSTEM|AGENT|BUILD|RELEASE|VSTS|TASK|USE_|FAIL_|MSDEPLOY|AZP_|AZURE_HTTP|COPYFILESOVERSSHV0|ENABLE_ISSUE_SOURCE_VALIDATION|MODIFY_NUMBER_OF_RETRIES_IN_ROBOCOPY|MSBUILDHELPERS_ENABLE_TELEMETRY|RETIRE_AZURERM_POWERSHELL_MODULE|ROSETTA2_WARNING)') {
[System.Environment]::SetEnvironmentVariable($matches[1], $matches[2], "Process")
Write-Host "##vso[task.setvariable variable=$($matches[1])]$($matches[2])"
}
25 changes: 22 additions & 3 deletions ci/run_tests_pipeline.yaml
@@ -61,23 +61,42 @@ stages:
displayName: "Build"
- script: |
git clone --depth 1 https://github.com/questdb/questdb.git
cd questdb
git submodule update --init --depth 1 java-questdb-client
displayName: git clone questdb master
condition: eq(variables.vsQuestDbMaster, true)
- bash: |
set -euo pipefail
JDK_HOME="${JAVA_HOME_25_X64:-}"
if [ -z "$JDK_HOME" ] || [ ! -x "$JDK_HOME/bin/javac" ]; then
JDK_HOME="/opt/jdk25"
sudo mkdir -p "$JDK_HOME"
curl -fsSL "https://api.adoptium.net/v3/binary/latest/25/ga/linux/x64/jdk/hotspot/normal/eclipse" |
sudo tar -xz -C "$JDK_HOME" --strip-components=1
fi
# Azure parses ##vso logging commands on both stdout and stderr.
# Keep xtrace off here so bash never emits a quoted stderr copy.
set +x
echo "##vso[task.setvariable variable=JAVA_HOME]$JDK_HOME"
echo "##vso[task.prependpath]$JDK_HOME/bin"
displayName: "Resolve JDK 25"
condition: eq(variables.vsQuestDbMaster, true)
- task: Maven@3
displayName: "Compile QuestDB master"
inputs:
mavenPOMFile: "questdb/pom.xml"
jdkVersionOption: "1.17"
javaHomeOption: "Path"
jdkDirectory: "$(JAVA_HOME)"
options: "-DskipTests -Pbuild-web-console"
condition: eq(variables.vsQuestDbMaster, true)
- script: python3 proj.py test 1
displayName: "Test vs released"
env:
JAVA_HOME: $(JAVA_HOME_17_X64)
JAVA_HOME: $(JAVA_HOME_25_X64)
- script: python3 proj.py test 1
displayName: "Test vs master"
env:
JAVA_HOME: $(JAVA_HOME_17_X64)
JAVA_HOME: $(JAVA_HOME)
QDB_REPO_PATH: "./questdb"
condition: eq(variables.vsQuestDbMaster, true)
- job: TestsAgainstVariousNumpyVersion1x
133 changes: 133 additions & 0 deletions client-dataframe-findings.md
@@ -0,0 +1,133 @@
# Client.dataframe findings

Scope: focused review of `questdb.ingress.Client.dataframe` after merging the
`c-questdb-client` submodule changes from PR #150. This deliberately ignores the
row-oriented `Buffer.dataframe`, `Sender.dataframe`, and transaction dataframe
paths except where they explain shared planner behavior.

## Current shape

`Client.dataframe` is the pooled QWP/WebSocket columnar ingestion path. It now
has two ingestion routes:

1. For fixed-table frames using a designated timestamp column name, first try
the Rust Arrow batch route when the symbol policy can be represented in
Arrow metadata:
`pyarrow.RecordBatch.from_pandas` -> `line_sender_buffer_append_arrow*` ->
`column_sender_flush_buffer` -> `column_sender_sync`.
2. If that route is not applicable or Rust rejects the frame before any flush,
fall back to the older Python dataframe planner:
`_FIELD_TARGETS_QWP` plan -> columnar-v1 validation -> prebuild object
columns -> chunk rows -> populate `column_sender_chunk` -> flush with
`column_sender_flush` -> finish with `column_sender_sync`.
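
The two-route control flow above can be sketched in isolation. This is a minimal illustration, not the code in `ingress.pyx`: `ingest_with_fallback`, `ArrowRouteRejected`, and the callable parameters are hypothetical names chosen for the example, and the only behavior taken from this document is "try the Arrow route first, fall back to the planner only if the frame was rejected before any flush".

```python
from typing import Callable, Optional


class ArrowRouteRejected(Exception):
    """Hypothetical signal: the Arrow route declined the frame before any flush."""


def ingest_with_fallback(
    frame: object,
    arrow_route: Optional[Callable[[object], None]],
    manual_planner: Callable[[object], None],
) -> str:
    """Try the Arrow route first; fall back only when nothing was flushed.

    `arrow_route` is None when the route is not applicable to this call.
    Returns the name of the route that handled the frame.
    """
    if arrow_route is not None:
        try:
            arrow_route(frame)
            return "arrow"
        except ArrowRouteRejected:
            # The rejection happened pre-flush, so no rows from this frame
            # reached the server and retrying via the planner cannot duplicate data.
            pass
    manual_planner(frame)
    return "planner"
```

Any other exception raised by the Arrow route propagates unchanged in this sketch, which mirrors the idea that a failure after a flush must not be retried through the second route.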

Main implementation references:

- `src/questdb/ingress.pyx:3996` - public Arrow-route attempt.
- `src/questdb/ingress.pyx:4251` - public `Client.dataframe`.
- `src/questdb/ingress.pyx:4310` - Arrow route is tried before the manual
planner.
- `src/questdb/ingress.pyx:4323` - fallback plan build using
`_FIELD_TARGETS_QWP`.
- `src/questdb/ingress.pyx:2539` - v1 fixed-table / timestamp-column
constraints.
- `src/questdb/ingress.pyx:3486` - buffer flush helper for Arrow route.
- `src/questdb/line_sender.pxd:997` - `column_sender_flush_buffer` binding.

The buffer-level Arrow APIs are bound, exercised by an internal benchmark hook,
and now used by the public compatible route:

- `src/questdb/line_sender.pxd:205` - `line_sender_buffer_append_arrow`.
- `src/questdb/line_sender.pxd:213` - `line_sender_buffer_append_arrow_at_column`.
- `src/questdb/ingress.pyx:3612` - `_dataframe_append_arrow_record_batch`.
- `src/questdb/ingress.pyx:3674` - `_bench_dataframe_append_arrow_buffer`.
- `c-questdb-client/include/questdb/ingress/column_sender.h:631` - pooled
buffer flush FFI contract.
- `c-questdb-client/questdb-rs-ffi/src/column_sender.rs:1697` - FFI bridge.
- `c-questdb-client/questdb-rs/src/ingress/column_sender/sender.rs:141` -
Rust pooled buffer flush implementation.

## Findings

### 1. Public `Client.dataframe` still partially duplicates Rust Arrow classification

Status: partially resolved.

Resolved parts:

- Python now binds and maps the Arrow-specific C error codes.
- Plain `LargeUtf8` and categorical `LargeUtf8` are preserved instead of cast.
- The buffer-level Arrow APIs are bound and exercised by an internal benchmark
hook.
- The pooled Rust FFI path can now flush a `line_sender_buffer` through a
borrowed QWP/WebSocket connection.
- Public `Client.dataframe` now tries the Rust Arrow batch route before the
manual planner for fixed-table, timestamp-column-name frames using
`symbols='auto'`, `symbols=True`, or explicit symbol lists that do not need
categorical de-dictionarizing.
- Real QuestDB round-trip tests now cover `LargeUtf8`, categorical
`LargeUtf8`, and timestamp unit semantics.
- Public route tests now cover Rust-only Arrow numeric/timestamp cases:
`UInt8`, `UInt16`, `UInt64`, `Float16`, and `timestamp[ms, tz]`.
- Public route tests now cover explicit symbol-list routing for plain string
columns; Python marks the selected Arrow fields with `questdb.symbol=true`
and Rust builds the QWP SYMBOL dictionary.

Remaining issue:

The default compatible public path no longer relies on the Python dataframe
planner for Arrow classification, but the route is still intentionally narrow.
The manual planner still handles, and therefore still duplicates
classification for, non-default public shapes:

- `symbols=False`.
- explicit symbol lists where non-listed pandas categoricals would need to be
converted back to VARCHAR rather than auto-emitted as SYMBOL.
- `table_name_col`.
- non-string `at` values.
- frames that cannot be converted to one Arrow `RecordBatch`.
- frames Rust rejects before any flush and that are still valid under the older
Python compatibility surface.
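
As a rough illustration of the routing conditions listed above, the argument-level checks could look like the predicate below. It is a sketch under stated assumptions: `arrow_route_applicable` and `categorical_columns` are hypothetical names, and the two data-dependent conditions (conversion to a single Arrow `RecordBatch`, and Rust rejecting the frame before any flush) are deliberately left out because they cannot be decided from the arguments alone.

```python
from typing import Iterable, Optional, Sequence, Union

SymbolsArg = Union[str, bool, Sequence[str]]


def arrow_route_applicable(
    *,
    table_name_col: Optional[str],
    at: object,
    symbols: SymbolsArg,
    categorical_columns: Iterable[str] = (),
) -> bool:
    """Return True when the call shape fits the narrow Arrow route."""
    if table_name_col is not None:
        return False  # per-row table names stay on the manual planner
    if not isinstance(at, str):
        return False  # only a designated timestamp column name is routed
    if symbols is False:
        return False
    if symbols is True or symbols == "auto":
        return True
    if isinstance(symbols, str):
        return False  # unrecognised string value
    # Explicit list: every categorical column must be listed, otherwise it
    # would need converting back to VARCHAR (de-dictionarizing).
    listed = set(symbols)
    return all(col in listed for col in categorical_columns)
```

For example, `arrow_route_applicable(table_name_col=None, at="ts", symbols=["sym"], categorical_columns=["sym", "venue"])` is false in this sketch because `venue` is categorical but not listed.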

The Rust Arrow classifier also supports more cases than the new public route has
real-server coverage for, including:

- Date, time, and duration values.
- `Utf8View` and binary variants.
- Raw Arrow dictionary symbols, distinct from pandas `CategoricalDtype`.
- Arrow `Float64` list arrays.

References:

- `c-questdb-client/questdb-rs/src/ingress/arrow.rs:1964` - Rust Arrow
classifier.
- `src/questdb/dataframe.pxi:1232` - Python Arrow resolver used by fallback
planner.
- `src/questdb/ingress.pyx:2539` - Python columnar validation used by fallback
planner.
- `src/questdb/ingress.pyx:3326` - Python per-column emission dispatch used by
fallback planner.
- `src/questdb/ingress.pyx:3996` - public Rust Arrow route.
- `test/test.py:272` - public QWP ack-server explicit-symbol route test.
- `test/system_test.py:3341` - real QuestDB explicit-symbol route test.
- `test/test_client_dataframe_fuzz.py:683` - local QWP dataframe fuzz for
route/fallback contracts.

Impact: on the narrow compatible route, new Rust Arrow ingestion capabilities
become public without a Python per-column emitter update. Broader public
shapes still need either more routing coverage or separate fallback planner
updates.

Recommended next step: add real-server round-trip coverage for more
Rust-classified families, then decide whether `table_name_col`, `symbols=False`,
or categorical de-dictionarizing are worth moving into the Arrow route.

## Suggested priority

1. Add real-server round-trip tests for the remaining Rust-classified families
before widening public claims for them.
2. Decide whether `table_name_col`, `symbols=False`, or categorical
de-dictionarizing should move to the Rust Arrow route.
3. Keep benchmarking large representative frames when route coverage changes;
the real-client benchmark now measures `client.dataframe()` without the old
manual preflight contamination.
51 changes: 38 additions & 13 deletions docs/conf.rst
@@ -29,6 +29,9 @@ The valid protocols are:
* ``tcps``: ILP/TCP with TLS
* ``http``: ILP/HTTP
* ``https``: ILP/HTTP with TLS
* ``qwpudp``: QWP/UDP (QuestWire Protocol over UDP)
* ``qwpws``: QWP/WebSocket
* ``qwpwss``: QWP/WebSocket with TLS

If you're unsure which protocol to use, see :ref:`sender_which_protocol`.

@@ -57,15 +60,25 @@ Connection
``host:port``.

This key-value pair is mandatory, but the port can be defaulted.
If omitted, the port will be defaulted to 9009 for TCP(s)
and 9000 for HTTP(s).
If omitted, the port will be defaulted to 9009 for TCP(s),
9000 for HTTP(s) and QWP/WebSocket, and 9007 for QWP/UDP.

* ``bind_interface`` - TCP/QWP-UDP only, ``str``: Network interface to bind
from. Useful if you have an accelerated network interface (e.g. Solarflare)
and want to use it.

* ``bind_interface`` - TCP-only, ``str``: Network interface to bind from.
Useful if you have an accelerated network interface (e.g. Solarflare) and
want to use it.

The default is ``0.0.0.0``.

* ``max_datagram_size`` - QWP/UDP-only, ``int > 0``: Maximum UDP datagram
payload size in bytes.

Default: 1400.

* ``multicast_ttl`` - QWP/UDP-only, ``int (0-255)``: Multicast TTL
(time-to-live) for UDP datagrams.

Default: 1.
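
The port defaulting described for ``addr`` can be illustrated with a small stand-alone helper. This is a sketch only: ``resolve_addr`` and ``DEFAULT_PORTS`` are hypothetical names that are not part of the client API, the port numbers are the ones stated in this section, and IPv6 literals are not handled.

```python
from typing import Dict, Tuple

# Default ports per protocol, as documented above.
DEFAULT_PORTS: Dict[str, int] = {
    "tcp": 9009,
    "tcps": 9009,
    "http": 9000,
    "https": 9000,
    "qwpws": 9000,
    "qwpwss": 9000,
    "qwpudp": 9007,
}


def resolve_addr(protocol: str, addr: str) -> Tuple[str, int]:
    """Split ``host[:port]`` and fill in the protocol's default port."""
    host, sep, port = addr.rpartition(":")
    if not sep:
        # No colon present: the whole string is the host.
        return addr, DEFAULT_PORTS[protocol]
    return host, int(port)
```

With this helper, ``resolve_addr("qwpudp", "localhost")`` yields port 9007, while an explicit port such as ``"db.example:9100"`` is kept as given.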

.. _sender_conf_auth:

Authentication
@@ -104,7 +117,7 @@ See the :ref:`auth_and_tls_example` example for more details.
TLS
===

TLS in enabled by selecting the ``tcps`` or ``https`` protocol.
TLS is enabled by selecting the ``tcps``, ``https``, or ``qwpwss`` protocol.

See the `QuestDB enterprise TLS documentation <https://questdb.com/docs/operations/tls/>`_
on how to enable this feature in the server.
@@ -133,6 +146,12 @@ still use TLS by setting up a proxy in front of QuestDB, such as
* ``tls_roots`` - ``str``: Path to a PEM-encoded certificate authority file.
When used it defaults the ``tls_ca`` to ``'pem_file'``.

For ``qwpwss``, this can also point at a JKS or PKCS#12 keystore when
paired with ``tls_roots_password``.

* ``tls_roots_password`` - ``str``: Password for the JKS or PKCS#12 keystore
configured by ``tls_roots``. This is supported only for ``qwpwss``.

* ``tls_verify`` - ``'on'`` | ``'unsafe_off'``: Whether to verify the server's
certificate. This should only be used for testing as a last resort and never
used in production as it makes the connection vulnerable to man-in-the-middle
@@ -170,13 +189,13 @@ The following parameters control the :ref:`sender_auto_flush` behavior.

* ``auto_flush_rows`` - ``int > 0`` | ``'off'``: The number of rows that will
trigger a flush. Set to ``'off'`` to disable.
*Default: 75000 (HTTP) | 600 (TCP).*

*Default: 75000 (HTTP) | 600 (TCP, QWP/UDP).*

* ``auto_flush_bytes`` - ``int > 0`` | ``'off'``: The number of bytes that will
trigger a flush. Set to ``'off'`` to disable.
Default: ``'off'``.

*Default: off (TCP, HTTP) | max_datagram_size (QWP/UDP, 1400 by default).*

* ``auto_flush_interval`` - ``int > 0`` | ``'off'``: The time in milliseconds
that will trigger a flush. Set to ``'off'`` to disable.
@@ -228,6 +247,7 @@ Protocol Version
================

Specifies the version of InfluxDB Line Protocol to use.
Not applicable for QWP/UDP senders.

Here is a configuration string with ``protocol_version=2`` for ``TCP``::

@@ -281,11 +301,16 @@ The following parameters control the HTTP request behavior.

* ``retry_timeout`` - ``int > 0``: The time in milliseconds to continue retrying
after a failed HTTP request. The interval between retries is an exponential
backoff starting at 10ms and doubling after each failed attempt up to a
maximum of 1 second.
backoff starting at 10ms and doubling after each failed attempt up to
``retry_max_backoff_millis``.

Default: 10000 (10 seconds).

* ``retry_max_backoff_millis`` - ``int >= 10``: Maximum per-attempt backoff in
milliseconds for the HTTP retry loop.

Default: 1000 (1 second).

* ``request_timeout`` - ``int > 0``: The time in milliseconds to wait for a
response from the server. This is in addition to the calculation derived from
the ``request_min_throughput`` parameter.
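
To make the interaction between ``retry_timeout`` and ``retry_max_backoff_millis`` concrete, the nominal backoff sequence can be computed as below. This is a simplified sketch, not the client's retry loop: ``backoff_schedule_ms`` is a hypothetical helper, it ignores any jitter and the time spent in the failed requests themselves, and it only encodes what the text states (start at 10ms, double after each failure, cap at ``retry_max_backoff_millis``).

```python
from typing import List


def backoff_schedule_ms(
    retry_timeout: int = 10_000,
    retry_max_backoff_millis: int = 1_000,
    initial_ms: int = 10,
) -> List[int]:
    """List the nominal sleep before each retry until the budget is spent."""
    delays: List[int] = []
    elapsed = 0
    delay = initial_ms
    while elapsed < retry_timeout:
        delays.append(delay)
        elapsed += delay
        # Double the delay, but never exceed the configured cap.
        delay = min(delay * 2, retry_max_backoff_millis)
    return delays
```

With the documented defaults the sequence begins 10, 20, 40, 80, 160, 320, 640 and then stays at 1000 for the rest of the retry window.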
13 changes: 13 additions & 0 deletions docs/examples.rst
@@ -5,6 +5,19 @@ Examples
Basics
======

.. _qwp_udp_example:

QWP over UDP
------------

The following example sends a row using QuestWire Protocol over UDP.

Requires a QuestDB instance with QWP/UDP receiver support enabled. The
default listener port is ``9007``.

.. literalinclude:: ../examples/qwp_udp.py
:language: python

HTTP with Token Auth
--------------------
