Use intermediate-and-swap pattern in full-refresh incremental path #398

Open
sdebruyn wants to merge 1 commit into microsoft:main from sdebruyn:full-refresh-swap

@sdebruyn (Collaborator)

Fixes #397.

The full-refresh branch of fabric__incremental drops the existing target before re-creating it, so a transient Fabric error, query timeout, capacity throttling event, OOM, or network hiccup during the CTAS step leaves the user with no target table at all. Downstream BI dashboards and reports then point at nothing until the next dbt run completes successfully.

Change

Adopt the intermediate-relation + backup-and-swap pattern that dbt-postgres, dbt-snowflake, dbt-spark, and dbt-bigquery use:

  1. Build the new table into an intermediate relation (make_intermediate_relation(target_relation)). If this fails, the existing target stays untouched.
  2. Rename the existing target to a backup relation.
  3. Rename the intermediate to the target.
  4. Drop the backup.

Leftover intermediate / backup / temp-view relations from a previous failed run are dropped before the build, matching the existing cleanup pattern.
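
For readers who want to see the ordering of the four steps in isolation, here is a minimal sketch in Python against an in-memory SQLite database. It is not the adapter's macro code: the function name `full_refresh_swap` and the `__dbt_tmp` / `__dbt_backup` suffixes are illustrative assumptions, and SQLite only stands in for a warehouse that supports CTAS and table renames.

```python
import sqlite3


def full_refresh_swap(conn: sqlite3.Connection, target: str, select_sql: str) -> None:
    """Rebuild `target` from `select_sql` without dropping it up front."""
    # Hypothetical relation names; the real adapter derives its own.
    intermediate = f"{target}__dbt_tmp"
    backup = f"{target}__dbt_backup"

    # Drop leftovers from a previous failed run before building.
    for leftover in (intermediate, backup):
        conn.execute(f'DROP TABLE IF EXISTS "{leftover}"')

    # 1. Build into the intermediate relation. If this raises,
    #    the existing target has not been touched yet.
    conn.execute(f'CREATE TABLE "{intermediate}" AS {select_sql}')

    # 2. Rename the existing target (if there is one) to the backup.
    exists = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?",
        (target,),
    ).fetchone()
    if exists:
        conn.execute(f'ALTER TABLE "{target}" RENAME TO "{backup}"')

    # 3. Rename the intermediate to the target.
    conn.execute(f'ALTER TABLE "{intermediate}" RENAME TO "{target}"')

    # 4. Drop the backup.
    conn.execute(f'DROP TABLE IF EXISTS "{backup}"')


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER)")
conn.execute("INSERT INTO orders VALUES (1)")

full_refresh_swap(conn, "orders", "SELECT 1 AS id UNION ALL SELECT 2")
rows = conn.execute("SELECT id FROM orders ORDER BY id").fetchall()
print(rows)
```

The only window in which the target name is missing is between steps 2 and 3, which are two metadata-only renames rather than a full table build.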

Testing

Happy to extend the change with an integration test that injects a CTAS failure during --full-refresh and asserts the previous target is still queryable, if that's the preferred follow-up.
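
As a rough idea of what such a test could assert, the sketch below injects a build failure and compares the two orderings. It is a toy model under stated assumptions, not a dbt integration test: it uses in-memory SQLite in autocommit mode instead of Fabric, a SELECT from a missing table stands in for the injected CTAS failure, and the helper names (`refresh_drop_first`, `refresh_build_first`, `target_survives`) are hypothetical.

```python
import sqlite3

# Stand-in for an injected CTAS failure: the source table does not exist.
FAILING_SELECT = "SELECT * FROM missing_source"


def table_exists(conn: sqlite3.Connection, name: str) -> bool:
    row = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?",
        (name,),
    ).fetchone()
    return row is not None


def refresh_drop_first(conn, target, select_sql):
    # Previous behaviour: drop the target, then rebuild it.
    conn.execute(f'DROP TABLE IF EXISTS "{target}"')
    conn.execute(f'CREATE TABLE "{target}" AS {select_sql}')


def refresh_build_first(conn, target, select_sql):
    # Proposed behaviour: build first, then swap via renames.
    tmp, backup = f"{target}__tmp", f"{target}__backup"
    for leftover in (tmp, backup):
        conn.execute(f'DROP TABLE IF EXISTS "{leftover}"')
    conn.execute(f'CREATE TABLE "{tmp}" AS {select_sql}')
    if table_exists(conn, target):
        conn.execute(f'ALTER TABLE "{target}" RENAME TO "{backup}"')
    conn.execute(f'ALTER TABLE "{tmp}" RENAME TO "{target}"')
    conn.execute(f'DROP TABLE IF EXISTS "{backup}"')


def target_survives(refresh) -> bool:
    """Run `refresh` with a failing build; report whether the target remains."""
    # isolation_level=None puts the connection in autocommit mode, so every
    # statement takes effect immediately and nothing is rolled back for us.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE orders (id INTEGER)")
    conn.execute("INSERT INTO orders VALUES (1)")
    try:
        refresh(conn, "orders", FAILING_SELECT)
    except sqlite3.OperationalError:
        pass  # the injected failure
    return table_exists(conn, "orders")


survived_drop_first = target_survives(refresh_drop_first)
survived_build_first = target_survives(refresh_build_first)
print(survived_drop_first, survived_build_first)
```

A real integration test would need to trigger the failure through the adapter during --full-refresh and then query the target relation in the warehouse.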

The full-refresh branch of fabric__incremental dropped the existing
target before re-creating it. If the subsequent CREATE TABLE AS SELECT
failed for any reason — transient Fabric error, query timeout, broken
model SQL, capacity throttling, OOM, network hiccup — the user was left
with no target table at all.

Build the new table into an intermediate relation first, then rename the
existing target to a backup, rename the intermediate to the target, and
drop the backup. If the build fails, the existing target stays in place.
This is the same pattern dbt-postgres, dbt-snowflake, and dbt-spark use.
Development

Successfully merging this pull request may close these issues.

Incremental --full-refresh drops the target before recreating it — data loss if creation fails