
feat: add realtime-box with Lookup Catalog sync workflows #440

Closed
toru-takahashi wants to merge 1 commit into treasure-data:main from toru-takahashi:feat/realtime-box

@toru-takahashi toru-takahashi commented Jun 18, 2026

Member

Summary

Adds a new realtime-box/ directory for Treasure AI RT 2.0 workflow templates, starting with the Lookup Catalog sync workflow.

realtime-box/lookup-catalog-sync/

Syncs tables from cdp_lookup_catalog to RT 2.0 internal storage using hash-based change detection — only changed rows are uploaded on each run.
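
For reviewers unfamiliar with the pattern, here is a minimal Python sketch of hash-based change detection. It is an illustration only: the workflow in this PR does the comparison in SQL against digest tables, and the hash function (SHA-256 here), the canonical JSON encoding, and the names `row_digest` and `changed_rows` are assumptions, not code from this PR.

```python
import hashlib
import json


def row_digest(row: dict) -> str:
    """Return a stable SHA-256 digest of a row (keys sorted so column order is irrelevant)."""
    canonical = json.dumps(row, sort_keys=True, separators=(",", ":"), default=str)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def changed_rows(rows, key_column, previous_digests):
    """Compare each row's digest with the stored one.

    Returns (rows_to_upload, new_digests). A row is uploaded when its key is
    new or its digest differs from the digest recorded on the previous run.
    """
    to_upload = []
    new_digests = {}
    for row in rows:
        key = row[key_column]
        digest = row_digest(row)
        new_digests[key] = digest
        if previous_digests.get(key) != digest:
            to_upload.append(row)
    return to_upload, new_digests
```

On a first run with an empty digest map every row is uploaded; on later runs only rows whose digest changed are returned. Deleted rows are not handled in this sketch.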

| File | Description |
| --- | --- |
| lookup_catalog_sync.dig | Main workflow: auto-discovers tables via information_schema, inits digest tables, syncs all or a single table |
| scripts.py | Type-aware JSON payload SQL generator supporting array<varchar>, array<bigint>, array<double>, scalar float artifact fix, and NULL element preservation |
| queries/discover_tables.sql | Discovers eligible tables, excludes _wf_* internal and legacy digest/updated tables |
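
To make "type-aware" concrete, the sketch below shows the kind of per-type handling described for scripts.py, applied to Python values rather than generated SQL. It is not the PR's code: the function names, the type strings `real` and `float`, and the reading of "float artifact" as 32-bit widening noise (for example 0.10000000149011612 for 0.1) are all assumptions.

```python
import json


def clean_float(value, digits=7):
    """Round to `digits` significant digits to drop assumed float32 widening noise."""
    return float(f"{value:.{digits}g}")


def payload_value(value, column_type):
    """Convert one column value according to its declared type."""
    if value is None:
        return None
    # Array types: convert each element, keeping NULL elements as None.
    if column_type == "array<varchar>":
        return [None if v is None else str(v) for v in value]
    if column_type == "array<bigint>":
        return [None if v is None else int(v) for v in value]
    if column_type == "array<double>":
        return [None if v is None else float(v) for v in value]
    # Scalar 32-bit floats (type names assumed): remove the widening artifact.
    if column_type in ("real", "float"):
        return clean_float(value)
    return value


def build_payload(row, schema):
    """Serialize a row to compact JSON; `schema` maps column name to type string."""
    converted = {col: payload_value(row.get(col), typ) for col, typ in schema.items()}
    return json.dumps(converted, separators=(",", ":"))
```

NULL array elements serialize as JSON `null` instead of being dropped, which is what "NULL element preservation" is taken to mean here.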

Key behaviours:

  • Digest table key column is derived dynamically from each table's first column (no hardcoded _key)
  • p_table_name can be set to sync a single table (useful for testing)
  • All internal/temporary tables use a consistent _wf_ prefix
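
The three behaviours above can be sketched in Python as follows. This is a hypothetical illustration, not the workflow's implementation: it assumes column metadata arrives as `(table_name, column_name, ordinal_position)` tuples, as a query on information_schema.columns would return, and the function names are made up for the example.

```python
INTERNAL_PREFIX = "_wf_"  # prefix used for internal/temporary tables


def key_columns(columns):
    """Map each table to its first column (lowest ordinal_position)."""
    first = {}
    for table, column, position in columns:
        if table not in first or position < first[table][1]:
            first[table] = (column, position)
    return {table: column for table, (column, _) in first.items()}


def tables_to_sync(all_tables, p_table_name=None):
    """Skip internal tables; restrict to one table when p_table_name is set."""
    eligible = [t for t in all_tables if not t.startswith(INTERNAL_PREFIX)]
    if p_table_name:
        return [t for t in eligible if t == p_table_name]
    return eligible
```

For example, `tables_to_sync(["products", "_wf_products_digest", "stores"])` yields `["products", "stores"]`, and passing `p_table_name="stores"` narrows that to `["stores"]`.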

Test plan

  • Run workflow against a test cdp_lookup_catalog table and verify records appear in RT 2.0
  • Verify automatic table discovery picks up all eligible tables
  • Verify incremental run uploads only changed records
  • Set p_table_name and verify single-table sync works

🤖 Generated with Claude Code

Adds realtime-box/lookup-catalog-sync with two variants:

manual/
  lookup_catalog_sync.dig    — iterates configured tables with explicit
                               column definitions
  queries/                   — SQL for digest init, extract, count, update

table-discovery/
  lookup_catalog_sync.dig    — auto-discovers tables via information_schema
  sync_table.dig             — reusable single-table sync called per table
  scripts/generate_sql.py    — type-aware JSON payload SQL generator
                               (supports array<varchar/bigint/double>,
                                float artifact fix, NULL element handling)
  queries/discover_tables.sql — excludes _wf_* internal tables

Both variants implement hash-based change detection (only changed rows
are uploaded on each run) and use the _wf_ prefix for internal tables.

Co-Authored-By: Treasure Work <291137728+treasure-work@users.noreply.github.com>
@toru-takahashi

Member Author

Superseded by #442 — consolidated and cleaned up structure.
