perf(inlet): speed up pull_chunk with bulk buffer slicing#111

Merged
cboulay merged 4 commits into
labstreaminglayer:mainfrom
sappelhoff:perf/pull-chunk-bulk-slice
Jun 17, 2026

@sappelhoff

Contributor

(PR and content generated with the help of Claude)

What

Speed up StreamInlet.pull_chunk — the main data-receive hot path — without changing its behavior or output.

Why

The current implementation reads the ctypes data/timestamp buffers one element at a time inside a nested list comprehension. Each data_buff[i] access crosses the ctypes boundary individually, which is slow. A single bulk slice (data_buff[:n]) converts the whole buffer in one C-level pass, and we then split it into per-sample lists in Python.
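
To illustrate the difference, here is a minimal sketch of the two extraction styles on a plain ctypes array. The function names `extract_per_element` and `extract_bulk` are hypothetical and are not taken from `inlet.py`; the buffer is built by hand rather than filled by liblsl.

```python
import ctypes


def extract_per_element(data_buff, num_samples, num_channels):
    # Old style: every data_buff[i] is a separate ctypes item access.
    return [
        [data_buff[s * num_channels + c] for c in range(num_channels)]
        for s in range(num_samples)
    ]


def extract_bulk(data_buff, num_samples, num_channels):
    # New style: one slice converts the buffer to a Python list,
    # then plain list slicing splits it into per-sample lists.
    flat = data_buff[: num_samples * num_channels]
    return [
        flat[s * num_channels : (s + 1) * num_channels]
        for s in range(num_samples)
    ]


num_channels, num_samples = 4, 3
values = [float(i) for i in range(num_channels * num_samples)]
data_buff = (ctypes.c_double * len(values))(*values)

old = extract_per_element(data_buff, num_samples, num_channels)
new = extract_bulk(data_buff, num_samples, num_channels)
assert old == new
```

Both functions return a `list[list]` with one inner list per sample, so callers should not be able to tell them apart.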

It also replaces num_elements / num_channels (float) + repeated int(...) truncation with integer floor division.
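
A small sketch of why floor division is the tidier choice here. The variable names mirror the description above, and the values are made up for illustration.

```python
num_elements = 12  # total values written into the buffer (illustrative)
num_channels = 4

# True division yields a float, which range() rejects unless wrapped in int().
as_float = num_elements / num_channels
try:
    range(as_float)
    float_accepted = True
except TypeError:
    float_accepted = False

# Floor division yields an int that can be used directly.
num_samples = num_elements // num_channels
indices = list(range(num_samples))
```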

Impact

Measured on the extraction step alone (the changed code), output verified identical via assert old() == new():

| shape | old | new | speedup |
| --- | --- | --- | --- |
| 8ch x 1024 | 0.71 ms | 0.23 ms | 3.1x |
| 32ch x 1024 | 2.43 ms | 0.68 ms | 3.6x |
| 64ch x 512 | 2.27 ms | 0.56 ms | 4.0x |
| 256ch x 256 | 4.54 ms | 1.13 ms | 4.0x |
| 1ch x 4096 | 0.92 ms | 0.62 ms | 1.5x |

~3-4x for typical multi-channel chunks; smaller for single-channel streams. This is the Python-side extraction only — end-to-end gain depends on how extraction-bound the receive loop is. The cf_string path still pays per-element .decode(), so its relative gain is smaller.
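
A rough way to reproduce this kind of measurement is sketched below with `timeit`. This is not the benchmark script used for the table above: the helper names, the `c_double` buffer, and the repeat count are all assumptions, and absolute timings will vary by machine and Python version.

```python
import ctypes
import timeit


def extract_per_element(data_buff, num_samples, num_channels):
    return [
        [data_buff[s * num_channels + c] for c in range(num_channels)]
        for s in range(num_samples)
    ]


def extract_bulk(data_buff, num_samples, num_channels):
    flat = data_buff[: num_samples * num_channels]
    return [
        flat[s * num_channels : (s + 1) * num_channels]
        for s in range(num_samples)
    ]


def time_extraction(num_channels, num_samples, repeats=20):
    # Build a hand-filled buffer; a real inlet buffer is filled by liblsl.
    n = num_channels * num_samples
    data_buff = (ctypes.c_double * n)(*[float(i) for i in range(n)])

    def old():
        return extract_per_element(data_buff, num_samples, num_channels)

    def new():
        return extract_bulk(data_buff, num_samples, num_channels)

    # Check equivalence before timing, as in the PR description.
    assert old() == new()
    t_old = timeit.timeit(old, number=repeats) / repeats
    t_new = timeit.timeit(new, number=repeats) / repeats
    return t_old, t_new


t_old, t_new = time_extraction(num_channels=8, num_samples=1024)
print(f"8ch x 1024: old {t_old * 1e3:.2f} ms, new {t_new * 1e3:.2f} ms")
```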

Behavior

Byte-identical output: same list[list] of values, same timestamp list type, unchanged cf_string decoding and free_char_p_array_memory call, unchanged dest_obj path. Verified with a live localhost round-trip across the numeric, string, and dest_obj paths; existing test suite passes.
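
For the string case, the sketch below shows why a bulk slice helps less: the slice yields raw `bytes` objects, and each one still needs its own `.decode()`. It assumes a `ctypes.c_char_p` array as a stand-in for the real buffer; the actual buffer layout in `inlet.py` may differ, and `decode_strings_bulk` is a hypothetical name.

```python
import ctypes


def decode_strings_bulk(str_buff, num_samples, num_channels):
    # One slice turns the pointer array into a list of bytes objects...
    raw = str_buff[: num_samples * num_channels]
    # ...but decoding is still a per-element Python call.
    decoded = [item.decode("utf-8") for item in raw]
    return [
        decoded[s * num_channels : (s + 1) * num_channels]
        for s in range(num_samples)
    ]


# Includes an empty string and multi-byte UTF-8 values.
values = ["alpha", "", "héllo", "日本語"]
encoded = [v.encode("utf-8") for v in values]
str_buff = (ctypes.c_char_p * len(encoded))(*encoded)

chunk = decode_strings_bulk(str_buff, num_samples=2, num_channels=2)
assert chunk == [["alpha", ""], ["héllo", "日本語"]]
```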

Comment thread src/pylsl/inlet.py Outdated
if dest_obj is None:
# Convert the whole ctypes buffer to a Python list in a single
# bulk slice (far faster than indexing the array element by
# element), then split it into one list per sample.

Contributor

No need to include the verbose comment. The reason for the change is in the git history and not needed in the code.

@sappelhoff sappelhoff Jun 16, 2026

Contributor Author

makes sense. I removed it

Comment thread src/pylsl/inlet.py
Convert the ctypes data and timestamp buffers to Python lists with a
single bulk slice instead of indexing element-by-element inside a nested
comprehension. This is ~3-4x faster at extracting multi-channel chunks
(measured on the extraction step alone) and produces byte-identical
output. Also use integer floor division for the sample count instead of
float division plus repeated int() truncation.

The rationale lives in the commit history; the code itself does not need
it. Addresses review feedback on labstreaminglayer#111.

Cover the two paths the bulk-slice extraction must preserve: a
multi-channel numeric chunk and a variable-length string chunk. Pushes a
known chunk and pulls it back, asserting identical values, list[list]
shape, and timestamp list type. The string case (empty and multi-byte
values) exercises the cf_string decode path that previously lacked
coverage.

.claude/ and CLAUDE.md are local tooling artifacts that should not be
tracked.

@sappelhoff sappelhoff force-pushed the perf/pull-chunk-bulk-slice branch from 665a42f to 7948d36 Compare June 16, 2026 08:34
@sappelhoff sappelhoff requested a review from cboulay June 16, 2026 08:37
@cboulay cboulay merged commit e146d7d into labstreaminglayer:main Jun 17, 2026
19 checks passed
@sappelhoff sappelhoff deleted the perf/pull-chunk-bulk-slice branch June 17, 2026 07:17
2 participants