You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
The query API enforces its SQL-safety invariants — RBAC table extraction, SHOW-command gating, cross-database rejection, dangerous-keyword denylist, multi-statement rejection, read_parquet/arc_partition_agg blocking — with regex and string-scanning over the raw SQL, not a real parser. This is an inherently leaky abstraction: each invariant has to independently re-derive "what is a token / literal / comment / statement boundary," and every divergence from real SQL tokenization is a potential bypass.
PR #489 (a small SHOW-gating change) made this concrete. It took 7 Gemini rounds, each surfacing a distinct tokenizer edge case in the same gate:
Comment hides the command — /* x */ SHOW DATABASES (regex on raw SQL).
Quoted db name not recognized — SHOW TABLES FROM "db" (char class too narrow).
Path traversal — SHOW TABLES FROM .. (char class too wide; no identifier validation).
Multi-statement smuggling — SHOW DATABASES; SELECT 1 (anchored $ assumes one statement).
Whitespace before paren in TVF detection — generate_series (...).
Endpoint parity — Arrow endpoint had no SHOW gate at all.
Comment marker inside a quoted identifier — SHOW TABLES FROM "my--db" (stripped comments before masking literals → truncation → wrong db authorized).
Each was individually fixed and is covered by a regression test, and the defensive posture held throughout (Arc databases are virtual / never ATTACHed, so a missed SHOW returns DuckDB's empty catalog, not tenant data; the denylist blocks destructive statements). So these were defense-in-depth hardening, not live tenant-data leaks. But the rate of new edge cases in one small PR is the signal: the approach doesn't converge.
Proposed direction
Treat ValidateSQLRequest as the single SQL-safety chokepoint and make the SQL handling robust rather than regex-by-feature:
Normalize once, centrally. All endpoints already call ValidateSQLRequest. Do the literal-masking + comment-stripping there once, and have it expose the normalized form (and/or a small set of derived facts: statement count, leading keyword, whether it's a SHOW/metadata command) so downstream gates don't each re-scan raw SQL. The normalizeSQLForShow helper added in fix(api): add SHOW command RBAC-gating to estimateQuery endpoint #489 is a step toward this — generalize it.
Move SHOW / metadata detection off raw-regex onto the normalized form (or a lightweight statement classifier). The anchored-regex approach keeps sprouting edge cases (rounds 1, 2, 4, 7 were all "the regex saw the wrong string").
Evaluate a real parser/lexer for the safety layer. DuckDB's own parser, a Go SQL tokenizer, or EXPLAIN-based statement classification would replace the hand-rolled token scanning that RBAC depends on. Scope/perf to be assessed — the hot path matters (current extraction is 6–30 µs).
Property/fuzz test the chokepoint. A fuzz target over ValidateSQLRequest + the SHOW gate (comments, quotes, mixed quoting, statement separators, unicode, whitespace) would surface these classes before review does.
This is the durable fix for the "whack-a-mole with Gemini on tokenizer nits" pattern. Not urgent (no known live bypass), but it's the root cause behind a recurring review cost.
Acceptance criteria (direction-setting; refine when scheduled)
SQL-safety invariants (RBAC extraction, SHOW gating, statement-count, denylist) operate on a single, centrally-produced normalized representation — no gate re-scans raw req.SQL.
SHOW/metadata-command detection no longer depends on anchored regex over raw SQL.
A fuzz/property test covers the normalization + gating against comment/quote/separator/whitespace permutations.
No regression in the existing query_rbac_test.go / TestValidateSQLRequest_* / TestShowTablesPattern / TestNormalizeSQLForShow suites.
Problem
The query API enforces its SQL-safety invariants — RBAC table extraction, SHOW-command gating, cross-database rejection, dangerous-keyword denylist, multi-statement rejection,
read_parquet/arc_partition_aggblocking — with regex and string-scanning over the raw SQL, not a real parser. This is an inherently leaky abstraction: each invariant has to independently re-derive "what is a token / literal / comment / statement boundary," and every divergence from real SQL tokenization is a potential bypass.PR #489 (a small SHOW-gating change) made this concrete. It took 7 Gemini rounds, each surfacing a distinct tokenizer edge case in the same gate:
/* x */ SHOW DATABASES(regex on raw SQL).SHOW TABLES FROM "db"(char class too narrow).SHOW TABLES FROM ..(char class too wide; no identifier validation).SHOW DATABASES; SELECT 1(anchored$assumes one statement).generate_series (...).SHOW TABLES FROM "my--db"(stripped comments before masking literals → truncation → wrong db authorized).Each was individually fixed and is covered by a regression test, and the defensive posture held throughout (Arc databases are virtual / never
ATTACHed, so a missed SHOW returns DuckDB's empty catalog, not tenant data; the denylist blocks destructive statements). So these were defense-in-depth hardening, not live tenant-data leaks. But the rate of new edge cases in one small PR is the signal: the approach doesn't converge.Proposed direction
Treat
ValidateSQLRequestas the single SQL-safety chokepoint and make the SQL handling robust rather than regex-by-feature:ValidateSQLRequest. Do the literal-masking + comment-stripping there once, and have it expose the normalized form (and/or a small set of derived facts: statement count, leading keyword, whether it's a SHOW/metadata command) so downstream gates don't each re-scan raw SQL. ThenormalizeSQLForShowhelper added in fix(api): add SHOW command RBAC-gating to estimateQuery endpoint #489 is a step toward this — generalize it.EXPLAIN-based statement classification would replace the hand-rolled token scanning that RBAC depends on. Scope/perf to be assessed — the hot path matters (current extraction is 6–30 µs).ValidateSQLRequest+ the SHOW gate (comments, quotes, mixed quoting, statement separators, unicode, whitespace) would surface these classes before review does.Relationship to other work
duckdb_arrowthe default so defaultgo testexercises the shipped path — round 6's Arrow-parity gap was invisible to untagged tests).Acceptance criteria (direction-setting; refine when scheduled)
req.SQL.query_rbac_test.go/TestValidateSQLRequest_*/TestShowTablesPattern/TestNormalizeSQLForShowsuites.