41 changes: 41 additions & 0 deletions .github/workflows/ci-unit-tests.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,41 @@
name: CI Unit Tests

on:
  push:
    branches:
      - main
      - 'releases/**'
  pull_request:
    branches:
      - main
      - 'releases/**'
  workflow_dispatch:

jobs:
  unit-tests:
    name: Unit tests (${{ matrix.os }} / Python ${{ matrix.python }})
    strategy:
      fail-fast: false
      matrix:
        os: [ubuntu-latest, macos-latest, windows-latest]
        python: ['3.10', '3.11', '3.12', '3.13']
    runs-on: ${{ matrix.os }}

    steps:
      - uses: actions/checkout@v4

      - name: Setup Python
        uses: actions/setup-python@v5
        with:
          python-version: ${{ matrix.python }}

      - name: Install dependencies
        shell: bash
        run: |
          python -m pip install --upgrade pip
          pip install -r requirements_dev.txt

      - name: Run unit tests
        shell: bash
        run: |
          pytest tests/unit/ -v
9 changes: 9 additions & 0 deletions CHANGELOG.md
@@ -1,9 +1,18 @@
## dbt-teradata 1.0.0a

### Features
- Open Table Format (OTF) support via Teradata DATALAKE objects. Models can target Iceberg/Delta Lake tables using 3-part naming (`<datalake>."<otf_db>"."<table>"`) by setting `catalog_name` and registering a `datalake` catalog integration in `catalogs.yml`. Supports model-level `partitioned_by`, `sorted_by`, `tblproperties`, and `purge_mode` configs. OTF tables defined as sources in `sources.yml` (with differing `database` and `schema`) are auto-detected and rendered with 3-part naming.
- `persist_docs` and adapter cache management are now applied to OTF table materializations, matching the standard table materialization. `grants` is not supported on OTF tables: setting it on an OTF model emits a warning and is otherwise ignored.

### Fixes
- `purge_mode` default changed from `PURGE ALL` to `NO PURGE` to avoid data loss on first DROP of an OTF model whose target table already holds data.
- `purge_mode` validation is now case-insensitive and centralised; invalid values raise a clear compile-time error.
- `add_query` now matches `DROP/DELETE DATABASE` case-insensitively, consistent with the `DROP TABLE/VIEW` branch.

### Docs

### Under the hood
- Adapter suppresses Teradata error 7825 ("OTF table not found in external catalog") on `DROP TABLE /*+ IF EXISTS */`, matching the existing semantics for errors 3807/3853/3854.
- OTF DROP logic deduplicated across `teradata__drop_relation` and `teradata__create_datalake_table_as` via shared `teradata__drop_otf_table` / `teradata__validate_purge_mode` helper macros.
- Unsupported config combinations on the OTF path (`table_kind`, `table_option`, `with_statistics`, `index`, `contract.enforced`) now raise compile-time errors instead of being silently ignored.
- `TeradataRelation.render()` for OTF relations no longer relies on the private `BaseRelation._render_iterator()` API.
115 changes: 115 additions & 0 deletions README.md
@@ -851,6 +851,121 @@ sources:
data_type: CHAR(1)
```

## Open Table Format (OTF) support

dbt-teradata can create and read Iceberg / Delta Lake tables via Teradata's native Open Table Format support. OTF tables live in an external object store (S3, Azure, GCS) and are registered with an external catalog (AWS Glue, Unity Catalog, etc.); Teradata accesses them through a pre-created `DATALAKE` object that encapsulates the catalog type, authentication, and object-store path.

### Pre-requisites

The following must exist on the Teradata side **before** running dbt:

* A `DATALAKE` object created in Teradata (e.g. via `CREATE DATALAKE my_lake ...`). dbt does not create DATALAKEs.
* A database inside that DATALAKE (the "OTF database") that will hold the OTF tables. dbt does not create this either.
* The dbt user must have permission to `CREATE TABLE` / `DROP TABLE` within the OTF database, and `SELECT` permission to read OTF tables defined elsewhere.

Refer to the Teradata documentation for `CREATE DATALAKE` syntax and the specific permissions required for your catalog backend.

### Configuration

Register the catalog integration in a `catalogs.yml` file at your dbt project root:

```yaml
catalogs:
  - name: my_otf_catalog
    active_write_integration: td_datalake
    write_integrations:
      - name: td_datalake
        catalog_type: datalake
        adapter_properties:
          datalake_name: my_lake      # the pre-created DATALAKE object
          otf_database: my_otf_db     # the pre-created OTF database within it
```

`catalog_type` must be `datalake`. `datalake_name` and `otf_database` are both required and validated at integration registration time.

Reference the catalog from a model via `catalog_name`:

```sql
-- models/sales_iceberg.sql
{{ config(
    materialized='table',
    catalog_name='my_otf_catalog',
    partitioned_by='YEAR(order_date), country',
    sorted_by='customer_id ASC',
    tblproperties="'write.format.default'='parquet', 'gc.enabled'='true'",
    purge_mode='NO PURGE'
) }}
select
    order_id,
    customer_id,
    country,
    order_date,
    amount
from {{ ref('stg_orders') }}
```
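
Going by the generated-SQL example in the `catalogs.py` docstring, a model configured this way would plausibly compile to a `DROP TABLE` followed by a `CREATE TABLE ... AS`. The Python sketch below assembles those two statements for illustration only. `build_otf_ddl` is a hypothetical helper, not part of dbt-teradata (the adapter does this in Jinja macros), and the exact clause order is an assumption taken from that docstring.

```python
from typing import List, Optional


def build_otf_ddl(
    datalake: str,
    otf_database: str,
    table: str,
    select_sql: str,
    partitioned_by: Optional[str] = None,
    sorted_by: Optional[str] = None,
    tblproperties: Optional[str] = None,
    purge_mode: str = "NO PURGE",
) -> List[str]:
    """Return [drop_statement, create_statement] for an OTF table (illustrative only)."""
    # 3-part name: DATALAKE unquoted, OTF database and table quoted.
    name = f'{datalake}."{otf_database}"."{table}"'
    drop = f"DROP TABLE /*+ IF EXISTS */ {name} {purge_mode};"

    # Optional clauses are emitted only when the model config sets them.
    clauses = [f"CREATE TABLE {name}"]
    if partitioned_by:
        clauses.append(f"PARTITIONED BY ({partitioned_by})")
    if sorted_by:
        clauses.append(f"SORTED BY {sorted_by}")
    if tblproperties:
        clauses.append(f"TBLPROPERTIES({tblproperties})")
    clauses.append(f"AS ({select_sql}) WITH DATA;")
    return [drop, "\n".join(clauses)]
```

Calling `build_otf_ddl("my_lake", "my_otf_db", "sales_iceberg", "select ...", partitioned_by="YEAR(order_date), country")` yields the drop statement first, which matches the drop-then-create order described under the limitations.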

### Naming conventions: 2-part vs 3-part

Teradata's native objects use **2-part** naming (`database.object`); in dbt-teradata, the `database` field is unused and the `schema` field carries the Teradata database name. OTF tables are the **only** Teradata objects that use **3-part** naming (`<datalake>."<otf_database>"."<table>"`).

For OTF tables, dbt-teradata maps:

| dbt field | Teradata concept |
| ------------ | --------------------------------- |
| `database` | DATALAKE name (unquoted) |
| `schema` | OTF database name (quoted) |
| `identifier` | OTF table name (quoted) |

When you set `catalog_name` on a model, dbt-teradata pulls `database` and `schema` from the registered catalog integration automatically. For an OTF table defined as a **source** (where there is no `catalog_name` model config), declare the `database` and `schema` explicitly in `sources.yml` — the adapter detects the 3-part shape (database ≠ schema) and renders it correctly:

```yaml
version: 2
sources:
  - name: customer_otf
    database: my_lake        # DATALAKE name
    schema: my_otf_db        # OTF database name
    tables:
      - name: customer_iceberg
```

A `{{ source('customer_otf', 'customer_iceberg') }}` reference in a model then compiles to `my_lake."my_otf_db"."customer_iceberg"`.
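
The detection rule (database ≠ schema means 3-part) can be sketched in a few lines of Python. This is a simplified stand-in, not the adapter's `TeradataRelation.render()`: the function name `render_relation` is hypothetical, and the quoting applied to native 2-part names is an assumption made for the example.

```python
from typing import Optional


def render_relation(database: Optional[str], schema: str, identifier: str) -> str:
    """Render a relation name; treats database != schema as the OTF (3-part) shape."""
    if database and database != schema:
        # OTF: DATALAKE name unquoted, OTF database and table quoted.
        return f'{database}."{schema}"."{identifier}"'
    # Native Teradata object: the dbt schema carries the Teradata database name.
    # Quoting of native names here is an assumption for illustration.
    return f'"{schema}"."{identifier}"'
```

With this rule, a native relation whose `database` is unset or equal to `schema` keeps 2-part naming, so existing models are unaffected.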

### Supported model config options

| Option | Type | Description |
| ---------------- | ------- | ---------------------------------------------------------------------------------------------------------- |
| `catalog_name` | string | Name of the catalog integration from `catalogs.yml`. Required to mark a model as OTF. |
| `partitioned_by` | string | Iceberg/Delta partition expression, e.g. `'YEAR(dt), country'`. |
| `sorted_by` | string | Sort order, e.g. `'id ASC'`. |
| `tblproperties` | string | Iceberg/Delta table properties, e.g. `"'gc.enabled'='true'"`. |
| `purge_mode` | string | DROP behavior. `'NO PURGE'` (default; removes catalog entry only) or `'PURGE ALL'` (also deletes data files on the object store). Case-insensitive. |

`persist_docs` and standard dbt cache management work on OTF models the same way they do on native tables. **`grants` is not supported on OTF tables** — Teradata does not allow `GRANT` on DATALAKE objects (access control is managed via AUTHORIZATION objects and external IAM/OAuth policies). Setting `grants` on an OTF model emits a warning and is otherwise ignored.
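
As a rough illustration of the case-insensitive `purge_mode` check, here is a Python stand-in for the validation the adapter performs in its `teradata__validate_purge_mode` macro. The function name, the whitespace normalization, and the use of `ValueError` are assumptions for this sketch; the adapter raises a dbt compile-time error instead.

```python
VALID_PURGE_MODES = ("NO PURGE", "PURGE ALL")


def normalize_purge_mode(value=None):
    """Return the canonical purge mode, defaulting to the safe 'NO PURGE'."""
    if value is None:
        return "NO PURGE"
    # Collapse whitespace and upper-case so 'purge all' or ' No  Purge ' are accepted.
    canonical = " ".join(str(value).split()).upper()
    if canonical not in VALID_PURGE_MODES:
        raise ValueError(
            f"Invalid purge_mode {value!r}; expected one of {VALID_PURGE_MODES}"
        )
    return canonical
```

Failing fast on an unknown value matters here because the value is interpolated into a `DROP TABLE` statement, where a typo could otherwise surface only as a database error.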

### Limitations and trade-offs

* **Non-atomic re-materialization.** OTF tables use 3-part naming that cannot be renamed via standard DDL, so the adapter cannot use the build-tmp-then-rename pattern that protects native tables on a failed CREATE. An OTF model is dropped before it is re-created — if the CREATE fails, the table is gone. Plan for `--full-refresh` workflows accordingly.
* **Model contracts are not supported on the OTF path.** Setting `contract.enforced: true` together with `catalog_name` raises a compile-time error.
* **Teradata-native table options are not supported.** Setting any of `table_kind`, `table_option`, `with_statistics`, or `index` together with `catalog_name` raises a compile-time error — these options describe native Teradata table storage and do not apply to Iceberg/Delta tables.
* **Only `catalog_type: datalake` is supported.** Other catalog types are rejected with a compile-time error.
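
The config guardrails listed above amount to a simple check against the model config. The sketch below shows one way to express it in Python; `check_otf_config` is a hypothetical name, the config is modelled as a plain dict, and `ValueError` stands in for the compile-time error the adapter raises.

```python
UNSUPPORTED_WITH_OTF = ("table_kind", "table_option", "with_statistics", "index")


def check_otf_config(config: dict) -> None:
    """Raise if an OTF model sets options that only apply to native tables."""
    if not config.get("catalog_name"):
        return  # not an OTF model, nothing to check

    offending = [key for key in UNSUPPORTED_WITH_OTF if config.get(key) is not None]

    # Model contracts are configured as a nested mapping: contract: {enforced: true}
    contract = config.get("contract") or {}
    if contract.get("enforced"):
        offending.append("contract.enforced")

    if offending:
        raise ValueError(
            "Not supported together with catalog_name: " + ", ".join(offending)
        )
```

Collecting every offending key before raising lets a user fix all conflicts in one pass rather than one error at a time.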

### Error 7825 suppression

Teradata raises error 7825 ("OTF table not found in external catalog") when `DROP TABLE` is issued against an OTF table that no longer exists in the external catalog (e.g. Glue). dbt-teradata treats 7825 the same way it treats native errors 3807/3853/3854 — suppressed under `IF EXISTS` semantics — so re-running a dbt project after an OTF table has been deleted externally does not fail.
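
The suppression rule can be isolated as a small predicate. The sketch below mirrors the matching logic in this PR's `add_query` change (upper-cased SQL, `[Error N]` substring match), but `should_suppress` itself is a hypothetical function written for illustration, not a method of the adapter.

```python
IGNORABLE_DROP_OBJECT_ERRORS = (3807, 3854, 3853, 7825)
IGNORABLE_DROP_DATABASE_ERRORS = (3802,)


def should_suppress(sql: str, error_message: str) -> bool:
    """Return True when a failed IF EXISTS drop should be treated as a no-op."""
    query_upper = sql.strip().upper()

    if ("DROP VIEW /*+ IF EXISTS */" in query_upper
            or "DROP TABLE /*+ IF EXISTS */" in query_upper):
        if any(f"[Error {code}]" in error_message
               for code in IGNORABLE_DROP_OBJECT_ERRORS):
            return True

    if ("DELETE DATABASE /*+ IF EXISTS */" in query_upper
            or "DROP DATABASE /*+ IF EXISTS */" in query_upper):
        if any(f"[Error {code}]" in error_message
               for code in IGNORABLE_DROP_DATABASE_ERRORS):
            return True

    return False
```

Note that a plain `DROP TABLE` without the `/*+ IF EXISTS */` hint is never suppressed, so genuine failures still surface.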

### Testing OTF locally

Functional tests for the OTF feature live in `tests/functional/adapter/test_otf_integration.py` and are gated on the following environment variables:

```bash
export DBT_TERADATA_DATALAKE='my_lake' # pre-created DATALAKE name
export DBT_TERADATA_OTF_DATABASE='my_otf_db' # pre-created OTF database name
```

Combined with the standard `DBT_TERADATA_SERVER_NAME` / `DBT_TERADATA_USERNAME` / `DBT_TERADATA_PASSWORD` connection variables, `pytest tests/functional/adapter/test_otf_integration.py` will exercise: basic create, idempotency, cross-model `ref()`, source-based 3-part naming, grants warning, `purge_mode: 'NO PURGE'`, and compile-time guardrails. Without the OTF env vars set, all OTF integration tests are skipped.

Pure unit tests (no Teradata required) live in `tests/unit/test_otf_catalogs.py` and run via `pytest tests/unit/`.
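
A minimal sketch of how such environment gating can be written is shown below. The helper name `missing_otf_env` is hypothetical and the actual test module may gate differently; only the two environment variable names come from this README.

```python
import os

REQUIRED_OTF_ENV = ("DBT_TERADATA_DATALAKE", "DBT_TERADATA_OTF_DATABASE")


def missing_otf_env(environ=None):
    """Return the names of required OTF variables that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_OTF_ENV if not env.get(name)]
```

In a pytest module the result could feed a module-level skip, for example `pytestmark = pytest.mark.skipif(bool(missing_otf_env()), reason="OTF env vars not set")`.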

## temporary_metadata_generation_schema (earlier fallback_schema)
dbt-teradata internally creates temporary tables to fetch the metadata of views for manifest and catalog creation.
If the user does not have permission to create tables in the schema they are working in, they can define a temporary_metadata_generation_schema (on which they have create and drop privileges) as a variable in dbt_project.yml.
142 changes: 142 additions & 0 deletions dbt/adapters/teradata/catalogs.py
@@ -0,0 +1,142 @@
"""Catalog integrations for Teradata Open Table Format (OTF) support.

This module provides the DATALAKE catalog integration that enables dbt to
create and manage Iceberg/Delta Lake tables via Teradata's native OTF support.
Tables are addressed using 3-part naming: <datalake>."<otf_database>"."<table>".

Configuration in catalogs.yml:

    catalogs:
      - name: my_catalog
        write_integrations:
          - name: iceberg_glue
            catalog_type: datalake
            adapter_properties:
              datalake_name: MyOTFLake    # pre-created DATALAKE object
              otf_database: my_otf_db     # pre-created database within the DATALAKE

Model-level config options (set via {{ config(...) }} in .sql files):

    partitioned_by  -- Iceberg partition expression, e.g. 'YEAR(dt), country'
    sorted_by       -- Iceberg sort order, e.g. 'id ASC'
    tblproperties   -- Iceberg table properties, e.g. "'gc.enabled'='true'"
    purge_mode      -- DROP behavior: 'NO PURGE' (default) or 'PURGE ALL'
                       NO PURGE  = remove catalog entry only, keep data files (safe default)
                       PURGE ALL = remove catalog entry AND delete data files on object store
"""

from dataclasses import dataclass
from typing import Optional

from dbt.adapters.catalogs import (
    CatalogIntegration,
    CatalogIntegrationConfig,
    CatalogRelation,
    InvalidCatalogIntegrationConfigError,
)
from dbt.adapters.contracts.relation import RelationConfig


# ---------------------------------------------------------------------------
# Relation dataclass -- carries catalog metadata into Jinja macros
# ---------------------------------------------------------------------------

@dataclass
class TeradataCatalogRelation(CatalogRelation):
    """Relation metadata for DATALAKE-based OTF tables.

    Fields populated from catalogs.yml (via the integration):
        catalog_type, catalog_name, table_format, file_format,
        external_volume, datalake_name, otf_database

    Fields populated from model config (via build_relation):
        partitioned_by, sorted_by, tblproperties, purge_mode
    """
    catalog_type: Optional[str] = None
    catalog_name: Optional[str] = None
    table_format: Optional[str] = None
    file_format: Optional[str] = None
    external_volume: Optional[str] = None
    datalake_name: Optional[str] = None
    otf_database: Optional[str] = None
    partitioned_by: Optional[str] = None
    sorted_by: Optional[str] = None
    tblproperties: Optional[str] = None
    # Controls DROP TABLE behavior for OTF tables:
    #   'NO PURGE'  -- removes catalog entry only, data files remain (default; safe)
    #   'PURGE ALL' -- removes catalog entry AND deletes data files on object store
    purge_mode: Optional[str] = None


# ---------------------------------------------------------------------------
# DATALAKE integration -- 3-part naming (datalake."otf_db"."table")
# ---------------------------------------------------------------------------

class TeradataDatalakeCatalogIntegration(CatalogIntegration):
    """Catalog integration for Teradata DATALAKE objects.

    Works with any external catalog (AWS Glue, Unity Catalog, etc.) because
    Teradata's DATALAKE object encapsulates the catalog type, auth, and
    object store path. dbt only needs the datalake_name and otf_database
    for 3-part naming.

    Generated SQL example:
        DROP TABLE /*+ IF EXISTS */ <datalake>."<otf_db>"."<table>" PURGE ALL;
        CREATE TABLE <datalake>."<otf_db>"."<table>"
        PARTITIONED BY (...)
        SORTED BY ...
        TBLPROPERTIES(...)
        AS (...) WITH DATA;

    Required adapter_properties in catalogs.yml:
        datalake_name -- name of the pre-created DATALAKE object in Teradata
        otf_database  -- name of the pre-created database within the DATALAKE
    """

    catalog_type = "datalake"
    allows_writes = True
    table_format = "iceberg"
    file_format = "parquet"

    def __init__(self, config: CatalogIntegrationConfig) -> None:
        super().__init__(config)
        # Restore class-level defaults if not provided in config
        if config.file_format is None:
            self.file_format = "parquet"
        adapter_props = config.adapter_properties or {}
        self.datalake_name = adapter_props.get("datalake_name")
        self.otf_database = adapter_props.get("otf_database")
        if not self.datalake_name:
            raise InvalidCatalogIntegrationConfigError(
                config.name,
                "adapter_properties.datalake_name is required -- "
                "it must match the pre-created DATALAKE object in Teradata",
            )
        if not self.otf_database:
            raise InvalidCatalogIntegrationConfigError(
                config.name,
                "adapter_properties.otf_database is required -- "
                "it must match the pre-created OTF database within the DATALAKE",
            )

    def build_relation(self, config: RelationConfig) -> TeradataCatalogRelation:
        """Build a TeradataCatalogRelation from integration + model config.

        Integration-level fields (datalake_name, otf_database, etc.) come from
        catalogs.yml. Model-level fields (partitioned_by, sorted_by,
        tblproperties, purge_mode) come from the model's {{ config(...) }}.
        """
        model_config = config.config if hasattr(config, 'config') and config.config else {}
        return TeradataCatalogRelation(
            catalog_type=self.catalog_type,
            catalog_name=self.catalog_name,
            table_format=self.table_format,
            file_format=self.file_format,
            external_volume=self.external_volume,
            datalake_name=self.datalake_name,
            otf_database=self.otf_database,
            partitioned_by=model_config.get("partitioned_by"),
            sorted_by=model_config.get("sorted_by"),
            tblproperties=model_config.get("tblproperties"),
            purge_mode=model_config.get("purge_mode"),
        )
24 changes: 13 additions & 11 deletions dbt/adapters/teradata/connections.py
@@ -431,20 +431,22 @@ def add_query(
         try:
             return SQLConnectionManager.add_query(self, sql, auto_begin, bindings, abridge_sql_log)
         except Exception as ex:
-            ignored = False
-            query = sql.strip()
-            if ("DROP view /*+ IF EXISTS */" in query) or ("DROP table /*+ IF EXISTS */" in query):
-                for error_number in [3807, 3854, 3853]:
-                    if f"[Error {error_number}]" in str (ex):
-                        ignored = True
+            query_upper = sql.strip().upper()
+            if ("DROP VIEW /*+ IF EXISTS */" in query_upper) or ("DROP TABLE /*+ IF EXISTS */" in query_upper):
+                # 3807 = object does not exist (standard Teradata)
+                # 3854 = table does not exist (standard Teradata)
+                # 3853 = view does not exist (standard Teradata)
+                # 7825 = OTF table not found in external catalog (e.g. Glue/Unity)
+                #        raised by ICEBERG_EXPORT UDF when dropping a nonexistent
+                #        DATALAKE table; safe to ignore under IF EXISTS semantics
+                for error_number in [3807, 3854, 3853, 7825]:
+                    if f"[Error {error_number}]" in str(ex):
                         return None, None
-            if ("DELETE DATABASE /*+ IF EXISTS */" in query) or ("DROP DATABASE /*+ IF EXISTS */" in query):
+            if ("DELETE DATABASE /*+ IF EXISTS */" in query_upper) or ("DROP DATABASE /*+ IF EXISTS */" in query_upper):
                 for error_number in [3802]:
-                    if f"[Error {error_number}]" in str (ex):
-                        ignored = True
+                    if f"[Error {error_number}]" in str(ex):
                         return None, None
-            if not ignored:
-                raise # rethrow
+            raise # rethrow
 
     # this method will return the datatype as string
     @classmethod
5 changes: 5 additions & 0 deletions dbt/adapters/teradata/impl.py
@@ -13,6 +13,7 @@
from dbt.adapters.teradata import TeradataConnectionManager
from dbt.adapters.teradata import TeradataRelation
from dbt.adapters.teradata import TeradataColumn
from dbt.adapters.teradata.catalogs import TeradataDatalakeCatalogIntegration
from dbt.adapters.capability import CapabilityDict, CapabilitySupport, Support, Capability
from dbt.adapters.base.meta import available
from dbt.adapters.base import BaseRelation
@@ -55,6 +56,10 @@ class TeradataAdapter(SQLAdapter):
    Column = TeradataColumn
    ConnectionManager = TeradataConnectionManager

    CATALOG_INTEGRATIONS = [
        TeradataDatalakeCatalogIntegration,
    ]

    CONSTRAINT_SUPPORT = {
        ConstraintType.check: ConstraintSupport.ENFORCED,
        ConstraintType.not_null: ConstraintSupport.ENFORCED,