refactor(query): extract optional query paths into support crates #19644

Open
KKould wants to merge 28 commits into databendlabs:main from KKould:feat/query-retained-gates

KKould (Member) commented Mar 30, 2026

I hereby agree to the terms of the CLA available at: https://docs.databend.com/dev/policies/cla/

Summary

  • extract the optional task-support, script-udf, and storage-stage query paths into dedicated support crates
  • wire databend-query feature gates to those support crates so toggling a feature removes the corresponding dependency edge instead of relying on more fragile in-crate cfg wiring
  • keep the default databend-query behavior aligned with main while making lean development builds cheaper to compile
  • rerun the stable cold-build benchmark serially after the support-crate refactor and refresh the comparison table

Changes

Compared with main, this PR changes how a few heavier optional query paths are organized in the workspace.

Instead of keeping the feature-specific implementations directly inside the main query service crate, this PR moves them into dedicated support crates and keeps databend-query responsible only for the thin registration and bridge points. That makes the optional dependency boundaries much clearer, reduces the chance of breaking feature combinations with scattered cfg blocks, and keeps the default build behavior unchanged.

The concrete changes are:

  • add src/query/task_support for task interpreters, task table functions, and task-related system tables
  • add src/query/script_udf_support for script UDF runtime, transform, and UDAF integrations
  • add src/query/storage_stage_support as the feature-gated boundary for stage-storage integration
  • update databend-query and databend-query/service to consume those support crates through feature-gated dependencies instead of directly owning the full implementations

Implementation

  1. Add dedicated support crates for the optional query paths that are most relevant to lean development builds.
  2. Move task-related implementations into src/query/task_support, leaving only the service-side context and registration bridge in databend-query/service.
  3. Move script UDF runtime, transform, and UDAF support into src/query/script_udf_support, and re-export the required entry points back into the service layer.
  4. Add src/query/storage_stage_support so the stage-storage path is isolated behind a narrower feature-gated crate boundary.
  5. Keep the default feature set aligned with the current product behavior on main, while allowing smaller dependency graphs when those optional paths are not needed during development.
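
To make the "thin registration and bridge point" idea concrete, here is a minimal single-file sketch, not the actual databend-query code. The inline `task_support` module stands in for the separate support crate, and `TableFunctionRegistry`, `register_all`, `build_registry`, and the `numbers` / `task_history` names are hypothetical. Only the `task-support` feature name comes from this PR.

```rust
use std::collections::BTreeMap;

/// Hypothetical registry standing in for the service-side registration point.
#[derive(Default)]
pub struct TableFunctionRegistry {
    // Maps a table function name to the component that provides it.
    entries: BTreeMap<&'static str, &'static str>,
}

impl TableFunctionRegistry {
    pub fn register(&mut self, name: &'static str, provider: &'static str) {
        self.entries.insert(name, provider);
    }

    pub fn contains(&self, name: &str) -> bool {
        self.entries.contains_key(name)
    }
}

/// Stands in for a separate support crate. In a real workspace this would be
/// an optional dependency, so disabling the feature removes the dependency
/// edge entirely; here the module is simply not compiled.
#[cfg(feature = "task-support")]
mod task_support {
    use super::TableFunctionRegistry;

    pub fn register_all(registry: &mut TableFunctionRegistry) {
        // Hypothetical task-related table function.
        registry.register("task_history", "task_support");
    }
}

/// The thin bridge kept in the service crate: one gated call per optional path.
pub fn build_registry() -> TableFunctionRegistry {
    let mut registry = TableFunctionRegistry::default();
    // Always-on entry (hypothetical name).
    registry.register("numbers", "service");

    #[cfg(feature = "task-support")]
    task_support::register_all(&mut registry);

    registry
}
```

With the feature disabled, the service-side code still compiles and the always-on entries are still registered; only the gated call and the module behind it disappear.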

Stable Compile Benchmark

Benchmark shape:

  • crate: databend-query
  • cleanup: run make clean before the benchmark
  • method: cold builds with an isolated target-dir per case
  • execution: fully serial, no parallel cargo runs
  • command shape: cargo check -p databend-query --lib
  • single-feature cases: start from --no-default-features --features simd and add exactly one optional path
  • comparison cases: current default feature set and --all-features
  • stability check: run the simd baseline twice, once at the beginning and once at the end

Benchmark environment:

env RUSTC_WRAPPER= CARGO_INCREMENTAL=0 CARGO_BUILD_JOBS=1 cargo check ...
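
For anyone who wants to reproduce the run, the benchmark shape above could be scripted roughly as follows. This is a sketch under assumptions, not the script used for this PR: `bench_command` and `time_case` are hypothetical helpers, and the per-case target directory layout is my own choice. The cargo arguments and environment variables are the ones listed above.

```rust
use std::io;
use std::path::Path;
use std::process::Command;
use std::time::Instant;

/// Builds (but does not run) one cold `cargo check` invocation for a case.
/// `feature_args` is e.g. `["--no-default-features", "--features", "simd"]`.
fn bench_command(case: &str, feature_args: &[&str], target_root: &Path) -> Command {
    let mut cmd = Command::new("cargo");
    cmd.args(["check", "-p", "databend-query", "--lib"]);
    cmd.args(feature_args);
    // Isolated target-dir per case so no artifacts are shared between cases.
    cmd.arg("--target-dir").arg(target_root.join(case));
    // Same environment as the benchmark: no wrapper, no incremental, one job.
    cmd.env("RUSTC_WRAPPER", "")
        .env("CARGO_INCREMENTAL", "0")
        .env("CARGO_BUILD_JOBS", "1");
    cmd
}

/// Runs one case to completion and returns wall-clock seconds.
/// Calling this for each case in a plain loop keeps the run fully serial.
fn time_case(mut cmd: Command) -> io::Result<f64> {
    let start = Instant::now();
    let status = cmd.status()?;
    if !status.success() {
        return Err(io::Error::new(io::ErrorKind::Other, "cargo check failed"));
    }
    Ok(start.elapsed().as_secs_f64())
}
```

Separating command construction from execution makes it easy to print or inspect each invocation before committing to a multi-hour serial run.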

The two simd baselines were:

  • 801.90s
  • 793.52s

So I use their mean, 797.71s, as the reference point, and treat roughly ±4.19s (half the spread between the two runs) as the noise band for this run; deltas inside that band are low-confidence.
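
The reference point and noise band are simple arithmetic on the two baseline runs. A small sketch, with hypothetical helper names, assuming the band is just half the spread between the two samples (with only two runs this is a rough indicator, not a statistical interval):

```rust
/// Returns (mean, half_range) of two repeated baseline timings, in seconds.
fn baseline_reference(first: f64, second: f64) -> (f64, f64) {
    let mean = (first + second) / 2.0;
    let half_range = (first - second).abs() / 2.0;
    (mean, half_range)
}

/// True when a delta is too small to distinguish from run-to-run noise.
fn within_noise(delta: f64, half_range: f64) -> bool {
    delta.abs() <= half_range
}
```

Calling `baseline_reference(801.90, 793.52)` gives the 797.71s mean and the 4.19s band used here; a delta such as task-support's +15.20s falls outside that band.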

Results:

| case | command | total time | delta vs simd_mean | delta % |
|---|---|---|---|---|
| simd_baseline | cargo check -p databend-query --lib --no-default-features --features simd | 801.90s | +4.19s | +0.53% |
| script-udf | cargo check -p databend-query --lib --no-default-features --features simd,script-udf | 860.24s | +62.53s | +7.84% |
| task-support | cargo check -p databend-query --lib --no-default-features --features simd,task-support | 812.91s | +15.20s | +1.91% |
| storage-stage | cargo check -p databend-query --lib --no-default-features --features simd,storage-stage | 1206.29s | +408.58s | +51.22% |
| default | cargo check -p databend-query --lib | 1275.07s | +477.36s | +59.84% |
| all-features | cargo check -p databend-query --lib --all-features | 1311.99s | +514.28s | +64.47% |
| simd_repeat | cargo check -p databend-query --lib --no-default-features --features simd | 793.52s | -4.19s | -0.53% |

Reviewer-facing takeaways:

  • storage-stage is still the dominant optional path by a wide margin: +408.58s vs the mean baseline, about +51.22%
  • the current default path is still mostly explained by storage-stage; default is only about 68.78s above storage-stage alone
  • script-udf has a clear standalone cost of +62.53s, about +7.84%
  • task-support is now a small but measurable standalone cost at +15.20s, about +1.91%, still far below storage-stage and script-udf
  • --all-features is heavier than the default path by about 36.92s, so most of the compile weight is already present in the default feature set
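
The delta columns and the gaps quoted in the takeaways can be recomputed from the raw totals. A minimal sketch with hypothetical names (`Case`, `delta_vs`, `report`); the same `delta_vs` helper works for "vs the mean baseline" and for "default vs storage-stage alone" by changing the reference value:

```rust
/// One benchmark row: case name and total wall-clock seconds.
struct Case {
    name: &'static str,
    seconds: f64,
}

/// Returns (absolute delta in seconds, delta as a percentage of `reference`).
fn delta_vs(reference: f64, seconds: f64) -> (f64, f64) {
    let delta = seconds - reference;
    (delta, delta / reference * 100.0)
}

/// Formats one line per case: name, total, signed delta, signed percentage.
fn report(reference: f64, cases: &[Case]) -> Vec<String> {
    cases
        .iter()
        .map(|case| {
            let (delta, percent) = delta_vs(reference, case.seconds);
            format!(
                "{:<14} {:>8.2}s {:>+8.2}s {:>+7.2}%",
                case.name, case.seconds, delta, percent
            )
        })
        .collect()
}
```

For example, `delta_vs(1206.29, 1275.07)` reproduces the roughly 68.78s gap between default and storage-stage alone.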

Tests

  • Unit Test
  • Logic Test
  • Benchmark Test
  • No Test - Pair with the reviewer to explain why

Type of change

  • Refactor
  • CI change
  • Bug fix
  • New feature

Risks

  • Optional feature combinations may still reveal hidden dependencies across the new support-crate boundaries that were previously masked by the monolithic default build graph.
  • Future changes can still erode the compile-time win if feature-specific code starts leaking back into the always-on query service path.
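
One way to surface the first risk early would be to check every combination of the optional features, not just the single-feature cases benchmarked above. The following is a sketch of how such a matrix could be enumerated, assuming `simd` stays the base feature; `feature_combinations` is a hypothetical helper and this PR does not add such a check.

```rust
/// Enumerates every subset of `optional` features on top of `base`,
/// formatted as values for cargo's `--features` flag.
fn feature_combinations(base: &str, optional: &[&str]) -> Vec<String> {
    let total = 1usize << optional.len();
    let mut combos = Vec::with_capacity(total);
    for mask in 0..total {
        let mut parts = vec![base];
        for (index, feature) in optional.iter().enumerate() {
            // Bit `index` of `mask` decides whether this feature is included.
            if mask & (1 << index) != 0 {
                parts.push(*feature);
            }
        }
        combos.push(parts.join(","));
    }
    combos
}
```

For the three optional paths in this PR that yields eight feature strings, each of which could be passed to `cargo check -p databend-query --lib --no-default-features --features <value>`.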

@github-actions github-actions bot added the pr-build label ("this PR changes build/testing/ci steps") Mar 30, 2026
@KKould KKould changed the title from "ci(query): slim optional feature builds" to "refactor(query): split optional features to slim compile graph" Mar 30, 2026
@github-actions github-actions bot added the pr-refactor label ("this PR changes the code base without new features or bugfix") Mar 30, 2026

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 7345132e6c

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

@KKould KKould marked this pull request as draft March 30, 2026 21:49
@KKould KKould changed the title from "refactor(query): split optional features to slim compile graph" to "refactor(query): retain optional gates for lean dev builds" Mar 30, 2026
@KKould KKould force-pushed the feat/query-retained-gates branch 2 times, most recently from 1dcabb6 to bb4f71d March 31, 2026 12:23
@KKould KKould force-pushed the feat/query-retained-gates branch from bb4f71d to cecb3e4 March 31, 2026 13:16
@KKould KKould self-assigned this Mar 31, 2026
@KKould KKould requested review from b41sh, sundy-li and zhang2014 March 31, 2026 16:56
KKould (Member, Author) commented Mar 31, 2026

@codex review

@KKould KKould removed the request for review from b41sh March 31, 2026 22:07
@KKould KKould marked this pull request as ready for review March 31, 2026 23:36
forsaken628 (Collaborator) commented Apr 2, 2026

Have you considered incremental compilation? The initial compilation time isn't that critical; what matters is how long it takes to recompile after making minor changes. It's also worth noting that the compilation outputs for different feature sets cannot be reused across each other.

@KKould KKould changed the title from "refactor(query): retain optional gates for lean dev builds" to "refactor(query): isolate retained gates into support crates" Apr 2, 2026
@KKould KKould changed the title from "refactor(query): isolate retained gates into support crates" to "refactor(query): extract optional query paths into support crates" Apr 2, 2026
@KKould KKould requested a review from forsaken628 April 3, 2026 13:33
@KKould KKould requested a review from b41sh April 3, 2026 13:57