Add support for DML statements (INSERT/UPDATE/DELETE) as CTE bodies #34

Merged: krleonid merged 12 commits into main from feature/dml-cte-support on Apr 6, 2026

@krleonid (Owner)

Allow INSERT, UPDATE, and DELETE statements with optional RETURNING clauses to be used as CTE bodies in WITH expressions. Previously, CTEs only supported SELECT statements.

Key changes:

  • Widen CommonTableExpressionInfo::query from SelectStatement to SQLStatement to accept DML statement types alongside SELECT
  • Add GetQueryNode() helper to uniformly access the root QueryNode regardless of the underlying statement type
  • Transform DML statements (INSERT/UPDATE/DELETE) in CTE position during parsing, rejecting unsupported types like DDL and COPY
  • Prevent recursive CTEs from using DML bodies (not semantically meaningful)
  • Preserve DML CTE execution for side effects even when unreferenced, by skipping CTE inlining elimination for DML operators and force-binding unreferenced DML CTEs in the planner
  • Update serialization with a new optional field (105: dml_query) for forward/backward compatible storage of DML CTE bodies
  • Add comprehensive tests for DML CTEs, including unreferenced side-effect execution, error cases, and the previously failing issue duckdb/duckdb#3417 ("INTERNAL Error: A CTE needs a SELECT")

Made-with: Cursor
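The first two key changes can be pictured with a small standalone sketch. It is not the code from this PR: `DMLStatement`, `MakeCTE`, the `node` members, and the error text are hypothetical simplifications, and only the names `CommonTableExpressionInfo`, `SelectStatement`, `SQLStatement`, `QueryNode`, and `GetQueryNode()` come from the description above.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical, heavily simplified stand-ins for the parser types.
enum class StatementType { SELECT, INSERT, UPDATE, DELETE, COPY };

struct QueryNode {
    std::string text;
};

struct SQLStatement {
    explicit SQLStatement(StatementType type_p) : type(type_p) {}
    virtual ~SQLStatement() = default;
    StatementType type;
};

struct SelectStatement : SQLStatement {
    SelectStatement() : SQLStatement(StatementType::SELECT) {}
    std::unique_ptr<QueryNode> node;
};

// Assumption: one class stands in for INSERT/UPDATE/DELETE statements.
struct DMLStatement : SQLStatement {
    explicit DMLStatement(StatementType type_p) : SQLStatement(type_p) {}
    std::unique_ptr<QueryNode> node;
};

struct CommonTableExpressionInfo {
    // Widened from SelectStatement to the SQLStatement base class.
    std::unique_ptr<SQLStatement> query;

    // Uniform access to the root node, whatever the statement type is.
    QueryNode &GetQueryNode() const {
        switch (query->type) {
        case StatementType::SELECT:
            return *static_cast<SelectStatement &>(*query).node;
        case StatementType::INSERT:
        case StatementType::UPDATE:
        case StatementType::DELETE:
            return *static_cast<DMLStatement &>(*query).node;
        default:
            // Anything else (e.g. COPY) is rejected as a CTE body.
            throw std::invalid_argument("unsupported statement type in CTE body");
        }
    }
};

// Hypothetical helper that builds a CTE body of the requested type.
CommonTableExpressionInfo MakeCTE(StatementType type, std::string text) {
    CommonTableExpressionInfo info;
    auto node = std::make_unique<QueryNode>(QueryNode{std::move(text)});
    if (type == StatementType::SELECT) {
        auto stmt = std::make_unique<SelectStatement>();
        stmt->node = std::move(node);
        info.query = std::move(stmt);
    } else {
        auto stmt = std::make_unique<DMLStatement>(type);
        stmt->node = std::move(node);
        info.query = std::move(stmt);
    }
    return info;
}
```

Callers that only need the root node can then use `MakeCTE(StatementType::DELETE, "DELETE FROM t RETURNING *").GetQueryNode()` without checking the statement type themselves.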

@krleonid force-pushed the feature/dml-cte-support branch 3 times, most recently from f0faf23 to 24f41ef (March 26, 2026 13:13)
@krleonid force-pushed the feature/dml-cte-support branch 8 times, most recently from b021cce to f7795a2 (April 3, 2026 09:21)
@krleonid force-pushed the feature/dml-cte-support branch from f7795a2 to 1538ffc (April 3, 2026 10:39)
krleonid and others added 11 commits April 3, 2026 15:25
This allows relaxing the restriction on a per-catalog basis in the future.

Made-with: Cursor
Data-modifying CTEs (INSERT/UPDATE/DELETE) are now rejected when they
appear inside subqueries or nested WITH clauses, matching Postgres
behavior. Also blocks the INSERT INTO t WITH ... SELECT form since
Postgres only supports WITH ... INSERT INTO.

Made-with: Cursor
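The placement rule from this commit can be sketched as a small standalone check. Everything here is hypothetical (the `CTEPlacement` flags, `CheckCTEPlacement`, and the error strings), and it assumes the `INSERT INTO t WITH ... SELECT` restriction applies only when the CTE body is data-modifying, which is one reading of the message above.

```cpp
#include <cassert>
#include <string>

enum class CTEBodyKind { SELECT, INSERT, UPDATE, DELETE };

// Hypothetical description of where a CTE appears in the statement.
struct CTEPlacement {
    bool inside_subquery = false;
    bool inside_nested_with = false;
    // True for the form: INSERT INTO t WITH ... SELECT
    bool attached_to_insert_source = false;
};

inline bool IsDataModifying(CTEBodyKind kind) {
    return kind != CTEBodyKind::SELECT;
}

// Returns an empty string when the CTE is allowed at this position,
// otherwise a (made-up) error message.
std::string CheckCTEPlacement(CTEBodyKind kind, const CTEPlacement &where) {
    if (!IsDataModifying(kind)) {
        return ""; // plain SELECT CTEs are unaffected by this rule
    }
    if (where.inside_subquery) {
        return "data-modifying CTE is not allowed inside a subquery";
    }
    if (where.inside_nested_with) {
        return "data-modifying CTE is not allowed in a nested WITH clause";
    }
    if (where.attached_to_insert_source) {
        return "use WITH ... INSERT INTO instead of INSERT INTO ... WITH";
    }
    return ""; // top-level WITH: allowed
}
```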
Replace hand-written CTE Serialize/Deserialize with generator output.
Field 101 (SelectStatement) is read_only for legacy storage; field 106
(query_node) is always serialized. Add deserialization constructor that
prefers query_node over legacy query->node.

Made-with: Cursor
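A rough model of the "prefer query_node over legacy query->node" rule, for readers who have not seen the generated code. The `FieldMap` storage, the string payloads, and the free functions `Serialize`/`Deserialize` are assumptions made for illustration; only the field ids 101 and 106 and the preference order come from the commit message.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

struct QueryNode {
    std::string sql;
};

struct SelectStatement {
    std::unique_ptr<QueryNode> node;
};

struct CommonTableExpressionInfo {
    std::unique_ptr<QueryNode> query_node;

    CommonTableExpressionInfo() = default;

    // Deserialization constructor: take query_node (field 106) when present,
    // otherwise fall back to the node of the legacy SelectStatement (field 101).
    CommonTableExpressionInfo(std::unique_ptr<SelectStatement> legacy_query,
                              std::unique_ptr<QueryNode> new_node) {
        if (new_node) {
            query_node = std::move(new_node);
        } else if (legacy_query) {
            query_node = std::move(legacy_query->node);
        }
    }
};

// Hypothetical storage format: field id -> payload.
using FieldMap = std::map<int, std::string>;

// Field 101 is read-only, so only field 106 is ever written.
FieldMap Serialize(const CommonTableExpressionInfo &cte) {
    FieldMap fields;
    if (cte.query_node) {
        fields[106] = cte.query_node->sql;
    }
    return fields;
}

CommonTableExpressionInfo Deserialize(const FieldMap &fields) {
    std::unique_ptr<SelectStatement> legacy;
    std::unique_ptr<QueryNode> node;
    auto legacy_it = fields.find(101);
    if (legacy_it != fields.end()) {
        legacy = std::make_unique<SelectStatement>();
        legacy->node = std::make_unique<QueryNode>(QueryNode{legacy_it->second});
    }
    auto node_it = fields.find(106);
    if (node_it != fields.end()) {
        node = std::make_unique<QueryNode>(QueryNode{node_it->second});
    }
    return CommonTableExpressionInfo(std::move(legacy), std::move(node));
}
```

Under these assumptions, storage that contains only field 101 still loads, and re-serializing it produces field 106 alone.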
Resolve sqlsmith fix.patch: combine upstream window/LEAD-LAG changes
with DML CTE updates (query_node in GenerateCTEs, Simplify(*query_node)).

Made-with: Cursor
Tag field 106 with version v2.0.0 so generated Serialize wraps query_node
in ShouldSerialize(8), matching the rest of the v2 format.

Made-with: Cursor
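The effect of the version tag can be illustrated with a toy serializer. This is a sketch under one assumption that the commit message does not spell out: that `ShouldSerialize(n)` returns true when the target storage version is at least `n`. The `Serializer` struct, `WriteProperty`, and `SerializeCTE` below are hypothetical stand-ins, not the generated code.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Toy serializer; target_version models the storage version being written.
struct Serializer {
    explicit Serializer(uint64_t target) : target_version(target) {}

    uint64_t target_version;
    std::map<int, std::string> fields;

    // Assumed semantics: a field is written only if the target version
    // is at least the version in which the field was introduced.
    bool ShouldSerialize(uint64_t version_added) const {
        return target_version >= version_added;
    }

    void WriteProperty(int field_id, const std::string &value) {
        fields[field_id] = value;
    }
};

// Field 106 (query_node) is wrapped in the version check.
void SerializeCTE(Serializer &serializer, const std::string &query_node) {
    if (serializer.ShouldSerialize(8)) {
        serializer.WriteProperty(106, query_node);
    }
}
```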
…plify

Simplify(CommonTableExpressionMap&) called Simplify(*cte_child.second->query_node),
passing QueryNode& — but sqlsmith only has Simplify(unique_ptr<QueryNode>&), not
Simplify(QueryNode&). Remove the dereference so the unique_ptr is passed directly,
matching the existing overload.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
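The overload mismatch described in this commit can be reproduced in isolation. The types below are simplified stand-ins (a `std::map` instead of the real CTE map, and a `simplified` flag instead of real rewriting), so treat this as an illustration of the compile error and its fix, not as the sqlsmith code.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

struct QueryNode {
    bool simplified = false;
};

struct CommonTableExpressionInfo {
    std::unique_ptr<QueryNode> query_node;
};

// Simplified stand-in for the real CTE map type.
using CommonTableExpressionMap =
    std::map<std::string, std::unique_ptr<CommonTableExpressionInfo>>;

// The only node-level overload that exists: it takes the owning pointer.
void Simplify(std::unique_ptr<QueryNode> &node) {
    if (node) {
        node->simplified = true;
    }
}

void Simplify(CommonTableExpressionMap &ctes) {
    for (auto &cte_child : ctes) {
        // Simplify(*cte_child.second->query_node);
        //   does not compile: the argument is a QueryNode&, and there is
        //   no Simplify(QueryNode &) overload.
        Simplify(cte_child.second->query_node); // pass the unique_ptr itself
    }
}
```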
@krleonid merged commit c74d4c2 into main on Apr 6, 2026
124 of 145 checks passed