feat: Add contract enforcement for incremental models #674

Open
Christy1984 wants to merge 3 commits into aws-samples:main from Christy1984:fix-incremental-contract-validation

Conversation

@Christy1984

Add contract enforcement for incremental models

Problem

Contract enforcement in dbt-glue worked only during table creation (initial run or full refresh). On incremental runs, contracts were ignored entirely, so mismatched columns passed through silently.

Current Behavior (Broken)

✅ Full refresh with extra column → FAILS (contract validated)
❌ Incremental run with extra column → SUCCEEDS (no validation!)
❌ `on_schema_change='append_new_columns'` → Ignores contract
❌ `on_schema_change='fail'` → Ignores contract

This made contract enforcement misleading: users thought their schema was protected, but validation ran only on the first run.

Root Cause

  1. No validation in incremental path: The incremental materialization never called any contract validation logic after creating temp tables.

  2. Temp view schema issue: Attempting to use adapter.get_columns_in_relation() on temporary views failed because it tried to describe them with a schema prefix (e.g., schema.table_tmp), but Spark temporary views are session-scoped and have no schema.
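The naming difference behind the second cause can be sketched in a few lines of Python. This is an illustration only, not the adapter's code (the PR itself changes Jinja macros); the function names `qualified_name` and `zero_row_probe` and the sample identifiers are hypothetical.

```python
from typing import Optional


def qualified_name(identifier: str, schema: Optional[str] = None) -> str:
    """Render a relation name, omitting the schema for session-scoped temp views."""
    return f"{schema}.{identifier}" if schema else identifier


def zero_row_probe(identifier: str, schema: Optional[str] = None) -> str:
    """Build a query that returns no rows but still exposes the column list."""
    return f"SELECT * FROM {qualified_name(identifier, schema)} WHERE 1=0"


# Temp view: no schema prefix, so the name resolves within the Spark session.
print(zero_row_probe("orders_tmp"))
# Temp table: the schema prefix is kept.
print(zero_row_probe("orders_tmp", schema="analytics"))
```

Describing `analytics.orders_tmp` when `orders_tmp` is a session-scoped view is the failure mode described above; the bare name avoids it.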

Solution

Changes Made

1. New Macro: validate_incremental_contract.sql

Location: dbt/include/glue/macros/materializations/incremental/validate_contract.sql

Creates a new validation macro that:

  • Queries the temp relation using run_query("SELECT * FROM ... WHERE 1=0") to get schema without the prefix issue
  • Compares SQL columns vs contract YAML columns
  • Handles both temp views (no schema) and temp tables (with schema)
  • Raises compiler error with clear table format matching dbt core style
  • Dynamically adjusts column widths for proper alignment
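The comparison step in the list above can be sketched as follows. This is a minimal Python illustration of the logic, assuming both sides are available as name-to-type mappings; it is not the macro itself, and `find_name_mismatches` and the sample columns are hypothetical.

```python
from typing import Dict, List


def find_name_mismatches(
    definition: Dict[str, str], contract: Dict[str, str]
) -> List[Dict[str, str]]:
    """Return one row per column that appears on only one side."""
    rows: List[Dict[str, str]] = []
    # Columns produced by the SQL but absent from the contract YAML.
    for name, dtype in definition.items():
        if name not in contract:
            rows.append({
                "column_name": name,
                "definition_type": dtype,
                "contract_type": "",
                "mismatch_reason": "missing in contract",
            })
    # Columns declared in the contract but not produced by the SQL.
    for name, dtype in contract.items():
        if name not in definition:
            rows.append({
                "column_name": name,
                "definition_type": "",
                "contract_type": dtype,
                "mismatch_reason": "missing in definition",
            })
    return rows


definition = {"col_a": "STRING", "col_b": "STRING", "extra_col": "STRING"}
contract = {"col_a": "STRING", "col_b": "STRING", "missing_col": "INTEGER"}
for row in find_name_mismatches(definition, contract):
    print(row["column_name"], "->", row["mismatch_reason"])
```

An empty result means the column names agree, in which case no error would be raised.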

2. Modified: incremental.sql

Location: dbt/include/glue/macros/materializations/incremental/incremental.sql

Added validation calls after temp table/view creation:

  • Line ~79: For iceberg/s3tables with schema change modes
  • Line ~91: For non-iceberg formats

{%- set is_tmp_relation_created = 'True' -%}
{%- do validate_incremental_contract(tmp_relation, target_relation) -%}  {# NEW #}
{%- do process_schema_changes(on_schema_change, tmp_relation, target_relation) -%}

Error Message Format

Matches dbt core contract validation format:

This model has an enforced contract that failed.
Please ensure the name, data_type, and number of columns in your contract match the columns in your model's definition.

| column_name        | definition_type | contract_type | mismatch_reason       |
| ------------------ | --------------- | ------------- | --------------------- |
| extra_col          | STRING          |               | missing in contract   |
| missing_col        |                 | INTEGER       | missing in definition |
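The dynamic column-width behaviour can be sketched in Python as below. This is an assumption about one reasonable way to compute the widths (each column as wide as its longest cell, header included), not the macro's actual Jinja; `format_mismatch_table` and `HEADERS` are hypothetical names, and the exact padding may differ from the output shown above.

```python
from typing import Dict, List

HEADERS = ["column_name", "definition_type", "contract_type", "mismatch_reason"]


def format_mismatch_table(rows: List[Dict[str, str]]) -> str:
    """Render rows as a pipe table whose columns fit their longest cell."""
    widths = {
        h: max([len(h)] + [len(r.get(h, "")) for r in rows]) for h in HEADERS
    }

    def render(cells: Dict[str, str]) -> str:
        padded = (cells[h].ljust(widths[h]) for h in HEADERS)
        return "| " + " | ".join(padded) + " |"

    header = render({h: h for h in HEADERS})
    divider = render({h: "-" * widths[h] for h in HEADERS})
    body = [render({h: r.get(h, "") for h in HEADERS}) for r in rows]
    return "\n".join([header, divider, *body])


rows = [
    {"column_name": "extra_col", "definition_type": "STRING",
     "contract_type": "", "mismatch_reason": "missing in contract"},
    {"column_name": "missing_col", "definition_type": "",
     "contract_type": "INTEGER", "mismatch_reason": "missing in definition"},
]
print(format_mismatch_table(rows))
```

Because the widths are recomputed from the rows on each call, a longer column name widens only its own column and every line keeps the same length.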

Testing

Test Environment

  • dbt-core: 1.10.15
  • dbt-glue: 1.10.15 + patches
  • AWS Glue: Interactive sessions
  • File formats tested: Iceberg, Parquet

Test Cases

Test 1: Extra Column ✅

Setup:

# Contract defines 3 columns
columns:
  - name: col_a
  - name: col_b
  - name: update_iceberg_ts

-- SQL produces 4 columns
SELECT col_a, col_b, update_iceberg_ts, extra_col

Result: ✅ Build FAILS with contract error showing extra_col | ... | | missing in contract

Test 2: Missing Column ✅

Setup:

# Contract defines 4 columns including missing_col

-- SQL only produces 3 columns (missing_col not selected)

Result: ✅ Build FAILS with contract error showing missing_col | | STRING | missing in definition

Test 3: Matching Columns ✅

Setup: SQL columns exactly match contract YAML

Result: ✅ Build SUCCEEDS

Test 4: Long Column Names ✅

Setup: Column name update_iceberg_tss (18 characters)

Result: ✅ Table formatting dynamically adjusts width, maintains alignment

Test 5: Contract Not Enforced ✅

Setup: contract.enforced: false or no contract

Result: ✅ Validation skipped (backward compatible)

Test 6: All File Formats ✅

Tested with:

  • ✅ Iceberg tables
  • ✅ Parquet tables
  • ✅ Temp views (session-scoped)
  • ✅ Temp tables (with schema)

Before/After Comparison

| Scenario                              | Before (Broken)     | After (Fixed)         |
| ------------------------------------- | ------------------- | --------------------- |
| Table creation with mismatch          | ✅ Fails            | ✅ Fails              |
| Incremental with extra column         | ❌ Succeeds         | ✅ Fails              |
| Incremental with missing column       | ❌ Succeeds         | ✅ Fails              |
| on_schema_change='append_new_columns' | ❌ Ignores contract | ✅ Validates contract |
| on_schema_change='fail'               | ❌ Ignores contract | ✅ Validates contract |
| No contract enforced                  | ✅ Works            | ✅ Works              |

Backward Compatibility

Fully backward compatible

  • Only validates when contract.enforced: true is set
  • No impact on models without contracts
  • No changes to existing behavior for table creation
  • No Python code changes - macros only

Benefits

  1. Catches schema drift early - Prevents bad data from entering tables
  2. Consistent validation - Works the same for both table creation and incremental runs
  3. Clear error messages - Shows exactly what's wrong in a readable table format
  4. Better data quality - Ensures contract guarantees are actually enforced
  5. Production safety - Blocks builds before bad schema changes reach production

Related Issues

This addresses a gap in contract enforcement that has existed since contracts were introduced in dbt core. The dbt-glue adapter comment at lines 178-180 of adapters.sql explicitly stated:

{# -- This does not enforce constraints and needs to be a TODO #}

This PR completes that TODO for incremental materializations.

Checklist

  • Macro changes only (no Python code modifications)
  • Tested with multiple file formats (Iceberg, Parquet)
  • Tested with temp views and temp tables
  • Error messages match dbt core format
  • Backward compatible (no breaking changes)
  • Dynamic column width calculation for clean formatting
  • Handles all on_schema_change modes
  • Validates only when contract.enforced is true

Christy1984 and others added 2 commits March 19, 2026 13:10
…al models

This fix addresses an issue where incremental models with `contract.enforced: true`
could be configured with `on_schema_change: ignore`, which violates dbt-core's
contract enforcement requirements.

Changes:
- Added `dbt_glue_validate_contract_with_schema_change()` macro in validate.sql
  that checks if contract is enforced and on_schema_change is 'ignore', raising
  a compiler error with a clear message
- Updated incremental.sql to call the validation macro early in the
  materialization process
- Added comprehensive unit tests for the validation logic
- Updated CHANGELOG.md to document the fix

The validation ensures that models with enforced contracts must use either
'append_new_columns' or 'fail' for on_schema_change, aligning with dbt-core
behavior and preventing confusing runtime errors.

Fixes the error: "Invalid value for on_schema_change: ignore. Models materialized
as incremental with contracts enabled must set on_schema_change to
'append_new_columns' or 'fail'"

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

- Add validation in incremental materialization path
- Fix temp view schema prefix issue
- Use run_query() instead of get_columns_in_relation()
- Validates both iceberg and non-iceberg formats
- Dynamic table formatting with proper alignment
- Clear error messages showing mismatches

The incremental contract validation was only checking column names (extra/missing)
but not data types, allowing type mismatches to pass silently. Now delegates to
dbt-core's get_assert_columns_equivalent() for full name + type validation,
consistent with table/view materializations. Also fixes contract config access
pattern in validate.sql and bumps version to 1.10.20.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
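The gap closed by the last commit (names matched, types unchecked) can be illustrated with a short Python sketch. It is a simplification under stated assumptions: types are compared as case-insensitive strings, whereas dbt-core's get_assert_columns_equivalent() does its own type handling; `find_type_mismatches` and the sample columns are hypothetical.

```python
from typing import Dict, List, Tuple


def find_type_mismatches(
    definition: Dict[str, str], contract: Dict[str, str]
) -> List[Tuple[str, str, str]]:
    """Return (column, definition_type, contract_type) where shared columns disagree."""
    mismatches: List[Tuple[str, str, str]] = []
    for name, contract_type in contract.items():
        definition_type = definition.get(name)
        # Columns present on only one side are a name mismatch, handled elsewhere.
        if definition_type is None:
            continue
        # Assumption: type names compare case-insensitively (STRING == string).
        if definition_type.lower() != contract_type.lower():
            mismatches.append((name, definition_type, contract_type))
    return mismatches


definition = {"col_a": "STRING", "col_b": "BIGINT"}
contract = {"col_a": "string", "col_b": "INT"}
print(find_type_mismatches(definition, contract))
```

A name-only check would pass this pair of schemas, since both sides declare `col_a` and `col_b`.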
Collaborator

@yotahk left a comment

@Christy1984 Thanks for the contribution. PR #662 was merged into main and modifies the same section of incremental.sql (lines 69-80), so this PR now has merge conflicts. Please rebase and resolve them. After rebasing, please make sure validate_incremental_contract() sits inside the {%- if language != 'python' -%} guard that #662 added; otherwise I believe the validation will run on Python models, where the temp view doesn't exist.
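The gating requested here, combined with the contract.enforced condition from the PR description, amounts to a two-part check. A minimal Python sketch, assuming the model language and the enforcement flag are already known; `should_validate_contract` is a hypothetical name, not a function in dbt-glue.

```python
def should_validate_contract(language: str, contract_enforced: bool) -> bool:
    """Decide whether incremental contract validation should run for a model."""
    # Per the review comment, Python models have no temp view to inspect,
    # so validation is skipped for them regardless of the contract setting.
    if language == "python":
        return False
    # Models without an enforced contract keep their existing behaviour.
    return contract_enforced


for language in ("sql", "python"):
    for enforced in (True, False):
        print(language, enforced, should_validate_contract(language, enforced))
```

Only the SQL model with an enforced contract reaches validation; the other three combinations skip it.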

Collaborator

Please remove the 1.10.19 → 1.10.20 change from this PR. We handle version bumps at release time to avoid conflicts between PRs.


{% if contract_enforced and on_schema_change == 'ignore' %}
{% set invalid_contract_schema_change_msg -%}
Invalid value for on_schema_change: {{ on_schema_change }}.
Collaborator

The error message says 'append_new_columns' or 'fail', but the code only rejects 'ignore', meaning 'sync_all_columns' is silently allowed. Since dbt-glue supports sync_all_columns (it's explicitly handled in the iceberg/s3tables path of incremental.sql), this is correct behavior, but the message is misleading. Would you update the message?
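One way to keep the message and the check from drifting apart is to build both from the same list of accepted values. The Python sketch below illustrates the idea only; the real check lives in a Jinja macro, and `ACCEPTED_WITH_CONTRACT` and `check_on_schema_change` are hypothetical names. It assumes the only values in play are dbt's four on_schema_change options, so rejecting anything outside the list is equivalent to rejecting 'ignore'.

```python
ACCEPTED_WITH_CONTRACT = ("append_new_columns", "fail", "sync_all_columns")


def check_on_schema_change(contract_enforced: bool, on_schema_change: str) -> None:
    """Raise if an enforced contract is combined with an unsupported setting."""
    if not contract_enforced:
        return
    if on_schema_change not in ACCEPTED_WITH_CONTRACT:
        # The allowed values in the message come from the same tuple as the check.
        allowed = ", ".join(repr(value) for value in ACCEPTED_WITH_CONTRACT)
        raise ValueError(
            f"Invalid value for on_schema_change: {on_schema_change}. "
            "Models materialized as incremental with contracts enabled "
            f"must set on_schema_change to one of: {allowed}."
        )


try:
    check_on_schema_change(True, "ignore")
except ValueError as exc:
    print(exc)
```

With this shape, adding or removing a supported mode changes the error text automatically.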
