Load jaffle-shop Parquet files from cloud storage into Snowflake.
API:
from data import load_from_gcs, load_from_s3
results = load_from_gcs(session, schema_name="RAW")
results = load_from_s3(session, bucket="s3://bucket/path/", schema_name="RAW")
Files:
data/ingestion.py
data/sql/ingestion/*.sql
Behavior:
- GCS: Download → internal stage → COPY INTO
- S3: External stage → COPY INTO
- Schema inferred from Parquet (INFER_SCHEMA)
- Idempotent (safe to re-run)
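
The two paths above can be sketched as an ordered list of Snowflake statements. This is a minimal illustration, not the contents of the actual templates: the object names (`PARQUET_FORMAT`, `JAFFLE_INTERNAL_STAGE`, `JAFFLE_S3_STAGE`), the helper functions `plan_s3_load` and `plan_gcs_load`, and the one-prefix-per-table stage layout are all assumptions made for the example.

```python
from typing import List

# Hypothetical object names; the real ones live in data/sql/ingestion/*.sql.
FILE_FORMAT = "PARQUET_FORMAT"
INTERNAL_STAGE = "JAFFLE_INTERNAL_STAGE"
S3_STAGE = "JAFFLE_S3_STAGE"


def _create_and_copy(schema_name: str, stage: str, table: str) -> List[str]:
    """Statements shared by both paths: infer the schema, then load."""
    fmt = f"{schema_name}.{FILE_FORMAT}"
    target = f"{schema_name}.{table.upper()}"
    # Assumes each table's files sit under their own prefix in the stage.
    location = f"@{schema_name}.{stage}/{table}"
    return [
        # CREATE OR REPLACE rebuilds the table from the inferred schema on each run.
        f"CREATE OR REPLACE TABLE {target} USING TEMPLATE ("
        f"SELECT ARRAY_AGG(OBJECT_CONSTRUCT(*)) FROM TABLE(INFER_SCHEMA("
        f"LOCATION => '{location}', FILE_FORMAT => '{fmt}')))",
        # Columns are matched by name rather than by position.
        f"COPY INTO {target} FROM {location} "
        f"FILE_FORMAT = (FORMAT_NAME = '{fmt}') "
        f"MATCH_BY_COLUMN_NAME = CASE_INSENSITIVE",
    ]


def plan_s3_load(bucket: str, schema_name: str, table: str) -> List[str]:
    """S3 path: external stage -> COPY INTO."""
    fmt = f"{schema_name}.{FILE_FORMAT}"
    return [
        f"CREATE FILE FORMAT IF NOT EXISTS {fmt} TYPE = PARQUET",
        f"CREATE STAGE IF NOT EXISTS {schema_name}.{S3_STAGE} URL = '{bucket}'",
    ] + _create_and_copy(schema_name, S3_STAGE, table)


def plan_gcs_load(local_path: str, schema_name: str, table: str) -> List[str]:
    """GCS path: file already downloaded locally -> internal stage -> COPY INTO."""
    fmt = f"{schema_name}.{FILE_FORMAT}"
    stage = f"{schema_name}.{INTERNAL_STAGE}"
    return [
        f"CREATE FILE FORMAT IF NOT EXISTS {fmt} TYPE = PARQUET",
        f"CREATE STAGE IF NOT EXISTS {stage}",
        # PUT uploads the local file; Parquet is left uncompressed by the client.
        f"PUT file://{local_path} @{stage}/{table} "
        f"AUTO_COMPRESS = FALSE OVERWRITE = TRUE",
    ] + _create_and_copy(schema_name, INTERNAL_STAGE, table)


if __name__ == "__main__":
    for stmt in plan_s3_load("s3://bucket/path/", "RAW", "customers"):
        print(stmt + ";")
```

If `session` is a Snowpark `Session` (the source does not say), each statement could be executed with `session.sql(stmt).collect()`.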
SQL Templates:
| File | Purpose |
|------|---------|
| create_parquet_file_format.sql | Parquet format definition |
| create_internal_stage.sql | Internal stage for GCS downloads |
| create_stage_s3_public.sql | External S3 stage |
| create_table_from_parquet.sql | Table creation with INFER_SCHEMA |
| copy_into_table.sql | COPY INTO with MATCH_BY_COLUMN_NAME |