Skip to content

Wrap model selection in sub-select to handle WHERE clauses#103

Open
sti0 wants to merge 1 commit into
Snowflake-Labs:mainfrom
sti0:bugfix/constraint-test-subquery-where
Open

Wrap model selection in sub-select to handle WHERE clauses#103
sti0 wants to merge 1 commit into
Snowflake-Labs:mainfrom
sti0:bugfix/constraint-test-subquery-where

Conversation

@sti0
Copy link
Copy Markdown

@sti0 sti0 commented Sep 9, 2025

fixes

Example:

models:
 - name: table
   columns:
      - name: id
        data_tests:
          - dbt_constraints.foreign_key:
              pk_table_name: ref("parent_table")
              pk_column_name: id
              config:
                where: is_current
select validation_errors.* from (
    select
        fk_child.*
    from (
        select
            fk_child_inner.<<id>>
        from (select * from (select * from <<table>> where is_current) dbt_subquery) as fk_child_inner
        where 1=1
            and fk_child_inner.<<id>> is not null
            ) fk_child
    left join (
        select
            fk_parent_inner..<<id>>
        from <<parent_table>> fk_parent_inner
        ) fk_parent
            on fk_child.<<id>> = fk_parent.<<id>>

    where fk_parent.<<id>> is null
) validation_errors

@sfc-gh-dflippo
Copy link
Copy Markdown
Collaborator

1.0.9 status update

We've spent time evaluating this PR for inclusion in the 1.0.9 release and concluded the unconditional subquery wrap creates a real risk of query-plan regression on enforcing databases (Postgres, Oracle, SQL Server) that we don't want to ship without benchmarks.

The 1.0.9 release plan is therefore:

Plan for this PR (#103) - moved to the 1.0.10 milestone:

  1. Build a 10M-row child / 1M-row parent fixture under integration_tests/dbt-core/models/pr103_perf/ running on testcontainers Postgres.
  2. Capture EXPLAIN ANALYZE (Postgres), EXPLAIN PLAN FOR (Oracle), SET SHOWPLAN_ALL ON (SQL Server), and EXPLAIN (Snowflake) for the UK and FK test SQL with and without config.where, both on main and on this branch.
  3. Research a Hypothesis B alternative: instead of wrapping the model in a subquery, push the test's config.where predicate into the FK JOIN condition. This may not be expressible because dbt-core injects config.where before the package macro receives model, but it's worth trying because it preserves index/partition use on the base relation.
  4. Decision matrix:
    • Hypothesis B works on all DBs -> build a B-flavor PR (credit @sti0); supersede this PR.
    • Hypothesis B doesn't work; A is acceptable with conditional wrap (config.where set only) -> rework this PR to scope the wrap; ship.
    • Both regress significantly -> close this PR with a documented "use a manual sql test" workaround for users who need config.where on constraint tests.

What we will NOT do:

  • Merge this PR as-is into 1.0.9 unconditionally - the user-reported pattern of subquery wraps regressing Postgres execution plans is exactly the failure mode our maintainer flagged.

If you want to drive this forward, the conditional-wrap fallback is straightforward:

{%- set has_where = model.config.get('where') %}
{%- if has_where %}
    from (select * from {{ model }}) as fk_child_inner
{%- else %}
    from {{ model }} fk_child_inner
{%- endif %}

This adds zero perf cost in the 99% case (no where clause) while preserving the correctness fix for the 1% with one. We'd be happy to land that variation if you'd prefer.

Thanks @sti0 for the original report and patch. The bug is real (#119, #100, #98 all duplicate it) - we just want to make sure the fix doesn't trade one problem for another.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants