
Add Federate and Govern Iceberg Tables using Snowpark Connect for Apache Spark quickstart #3416

Open
sfc-gh-ncherukuri wants to merge 2 commits into Snowflake-Labs:master from sfc-gh-ncherukuri:add-iceberg-scos-assets


@sfc-gh-ncherukuri

Contributor

Summary

  • Adds a new quickstart guide: Federate and Govern Iceberg Tables using Snowpark Connect for Apache Spark
  • Demonstrates two scenarios: (1) Snowflake-managed (SF-managed) Iceberg tables with Snowpark Connect + Horizon governance, (2) federated Databricks Unity Catalog Iceberg tables via Catalog Integration + Cortex AI enrichment pipeline
  • Includes 7 fully tested, idempotent assets (SQL setup, Python/Databricks notebooks, Snowflake workspace notebooks)

Assets

| File | Purpose |
|------|---------|
| 01_sf_iceberg_catalog_setup.sql | Snowflake setup: external volumes, Iceberg tables, masking policies, roles |
| 02_sf_iceberg_demo.ipynb | Snowpark Connect governance demo on SF-managed Iceberg tables |
| 03_databricks_rw_sf_iceberg.py | Databricks notebook: read/write SF-managed Iceberg via Horizon IRC |
| 04_databricks_create_uc_tables.py | Databricks notebook: create Unity Catalog Delta/Iceberg tables |
| 05_databricks_federation_demo.ipynb | Snowpark Connect federation + governance demo on Databricks UC tables |
| 06_cortex_ai_pipeline.ipynb | Snowflake notebook: 3-phase Cortex AI enrichment + Cortex Analyst semantic view |
| 07_ai_pipeline.ipynb | Snowpark Connect AI pipeline: federated Iceberg → Cortex enrichment → SF-managed Iceberg |

Test plan

  • All SQL/notebook assets verified idempotent (safe to re-run consecutively without errors)
  • Cortex error 370001 resolved via 3-phase approach (stage → CTAS → UPDATE)
  • Masking policy re-run handled via IF NOT EXISTS + try-except UNSET pattern
  • Guide renders correctly on localhost:8000 via gulp serve
  • All file references in the .md match actual asset filenames (01–07)
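
For reviewers who want to see the shape of the masking-policy re-run handling mentioned in the test plan, here is a minimal sketch of an "IF NOT EXISTS + try-except UNSET" pattern. It is not taken from the PR's assets: the function name `apply_masking_policy`, the `run_sql` callable, the `DEMO_ADMIN` role, and the policy body are all hypothetical placeholders, and the sketch assumes the UNSET statement may fail when no policy is attached yet.

```python
from typing import Callable, List


def apply_masking_policy(
    run_sql: Callable[[str], object],  # hypothetical executor, e.g. a session wrapper
    table: str,
    column: str,
    policy: str,
) -> List[str]:
    """Attach `policy` to `table.column` in a way that tolerates re-runs.

    Returns the statements that executed successfully, in order.
    """
    executed: List[str] = []

    def run(stmt: str) -> None:
        run_sql(stmt)
        executed.append(stmt)

    # IF NOT EXISTS makes policy creation a no-op on the second run.
    # The policy body and the DEMO_ADMIN role are illustrative only.
    run(
        f"CREATE MASKING POLICY IF NOT EXISTS {policy} AS (val STRING) "
        "RETURNS STRING -> CASE WHEN CURRENT_ROLE() = 'DEMO_ADMIN' "
        "THEN val ELSE '***MASKED***' END"
    )

    # On a re-run the column may already carry a policy, so detach it first.
    # If there is nothing to detach, the failure is deliberately ignored.
    try:
        run(f"ALTER TABLE {table} MODIFY COLUMN {column} UNSET MASKING POLICY")
    except Exception:
        pass

    run(f"ALTER TABLE {table} MODIFY COLUMN {column} SET MASKING POLICY {policy}")
    return executed
```

In a notebook, `run_sql` could be something like `lambda q: session.sql(q).collect()` with a Snowpark session. A real asset would probably catch the driver's specific SQL exception rather than a bare `Exception`.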

Generated with Cortex Code

…e demo, scenario renumber

- Cell 12: replace CTAS write with two-phase approach (stage to temp table →
  CTAS Iceberg) to avoid error 370001; all refs parameterized, TIMESTAMP_LTZ(6)
- Cell 14: replace governance demo with tested deployed version
- Cell 1: SCENARIO 4 → SCENARIO 2
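
The Cell 12 change described above (stage to a temp table, then CTAS into Iceberg, with all references parameterized) might look roughly like the following sketch. It only builds the two SQL strings; the function and parameter names are hypothetical, the source query is assumed to already cast timestamps to TIMESTAMP_LTZ(6), and nothing here is copied from the notebook itself.

```python
from typing import List


def two_phase_iceberg_write(
    source_query: str,     # SELECT producing the rows to persist (assumed pre-cast)
    staging_table: str,    # hypothetical temp table name
    target_table: str,     # hypothetical Snowflake-managed Iceberg table name
    external_volume: str,  # hypothetical external volume name
    base_location: str,    # hypothetical path under the external volume
) -> List[str]:
    """Return the two statements of a stage-then-CTAS Iceberg write."""
    # Phase 1: materialize the query result in a temporary table.
    stage = f"CREATE OR REPLACE TEMPORARY TABLE {staging_table} AS {source_query}"

    # Phase 2: plain CTAS from the staged rows into the Iceberg table.
    ctas = (
        f"CREATE OR REPLACE ICEBERG TABLE {target_table} "
        f"EXTERNAL_VOLUME = '{external_volume}' "
        "CATALOG = 'SNOWFLAKE' "
        f"BASE_LOCATION = '{base_location}' "
        f"AS SELECT * FROM {staging_table}"
    )
    return [stage, ctas]
```

Running the returned statements in order, and using `CREATE OR REPLACE` in both, keeps the cell safe to re-run, which matches the idempotency goal stated in the test plan.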

Generated with [Cortex Code](https://docs.snowflake.com/en/user-guide/cortex-code/cortex-code)

Co-Authored-By: Cortex Code <noreply@snowflake.com>

@github-actions

github-actions Bot commented Jul 2, 2026

Contributor

👋 Helpful Information

💡 Non-Image Files in Assets Folder

Non-image files found in /assets folders will NOT be uploaded to snowflake.com. If you are referencing them in your guide, you should link to them directly (e.g. using permanent GitHub links).

  • site/sfguides/src/federate-and-govern-iceberg-tables-using-snowpark-connect-for-apache-spark/assets/07_ai_pipeline.ipynb

These are informational messages and will not block your PR from being merged.
