diff --git a/CHANGELOG.md b/CHANGELOG.md
index d4ab923..17eb554 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -4,6 +4,17 @@ All notable changes to this project are documented here. The format follows [Kee
## [Unreleased]
+### Added — `@craftstack/data-analytics-demo` 0.1.0 (Spec-Driven Stage 4, T-01 to T-14)
+
+Polyglot package (Python + a single JS sub-package member) shipping a local-only customer-analytics demo end-to-end. Built across PRs #82 #83 #86 #87 #88 #89 #90 #91 (this PR is #92, the docs + changelog close-out). [ADR-0070](docs/adr/0070-data-analytics-demo-polyglot-adoption.md) (with the 2026-05-18 dashboard pivot amendment) records the design and the Evidence-vs-Python-Jinja2-Plotly tradeoff.
+
+- **Six pipeline layers**: data generation (Faker + numpy + DuckDB), dbt transformation (staging / intermediate / marts), ML (LogisticRegression + XGBoost churn — ROC-AUC ≥ 0.70 floor; LogReg upsell propensity — lift @ top-10% ≥ 1.5× floor), local-LLM narrative (Ollama, AC-4.3 cloud-credential guard), self-built static-HTML dashboard (Jinja2 + Plotly), MetricFlow KPI semantic layer with pure-Python validator.
+- **CI infrastructure**: `.github/workflows/python-test.yml` (ruff + mypy --strict + pytest with 80 % coverage floor); `.github/workflows/python-audit.yml` (pip-audit `--strict` against OSV); Dependabot `pip` ecosystem grouped by dbt / ml / duckdb / dev.
+- **Security mitigations**: `duckdb >= 1.4.2` pin (CVE-2025-64429), no external API credentials anywhere, all generated artifacts gitignored, every dependency listed in ADR-0070 with literal license + maintenance verification.
+- **Test surface**: 36 pytest cases (data / dbt / ml-churn / ml-upsell / narrative / dashboard / semantic / e2e); coverage 87.20 %.
+
+The package becomes the seventh `packages/*` entry and the monorepo's first Python sub-tree.
+
## [0.5.19] — 2026-04-29
### Changed — Run #6 hiring-sim findings closure + deploy-visible-surface coverage extension (ADR-0069)
diff --git a/HANDOFF.md b/HANDOFF.md
index 200ebe3..6015c6e 100644
--- a/HANDOFF.md
+++ b/HANDOFF.md
@@ -4,28 +4,17 @@ Tracks ephemeral in-progress state between AI-assisted sessions. For shipped sta
## Current
-- **last session**: 2026-05-17
-- **status**: stable on main; opacity-sanitize + handoff infra shipped (PR #79 + #80)
-- **active work item**: data analytics demo package (planning phase — prior art scan done, scaffold not yet started)
-- **next planned**: Spec-Driven Stage 1 Discovery for `packages/data-analytics-demo/`
+- **last session**: 2026-05-18
+- **status**: stable on main; `@craftstack/data-analytics-demo` 0.1.0 shipped (PRs #82 #83 #86 #87 #88 #89 #90 #91 #92)
+- **active work item**: none in progress
+- **next planned**: TBD — pipeline complete and reproducible via `make demo` inside `packages/data-analytics-demo/`
- **blockers**: none
-### Planned package — data-analytics-demo
+### Shipped — data-analytics-demo (2026-05-18)
-Customer-behavior / SaaS-style analytics demo for portfolio. Constraints: local-only (no credit card), local LLM (Ollama), synthetic data only.
+Local-only SaaS customer-analytics demo. Six pipeline layers (data / dbt / ml / narrative / dashboard / semantic) plus polyglot CI infrastructure. See [ADR-0070](docs/adr/0070-data-analytics-demo-polyglot-adoption.md) for the design and the dashboard pivot (Evidence → self-built Python + Jinja2 + Plotly).
-Verified prior-art seeds (license + maintenance literal-checked 2026-05-17):
-
-| seed | license | role |
-|---|---|---|
-| dbt-labs/jaffle_shop_duckdb (default branch: `duckdb`) | Apache 2.0 | dbt project skeleton (staging/marts 2-tier pattern) |
-| evidence-dev/evidence | MIT | BI-as-code dashboard (SQL fenced in markdown) |
-| dbt-labs/metricflow | Apache 2.0 | semantic layer YAML (single KPI definition) |
-| duckdb/duckdb (tpcds extension) | MIT | synthetic SaaS data via `CALL dsdgen(sf=1)` |
-| ollama/ollama (Llama 3.1 8B Instruct) | MIT | local LLM for SHAP→narrative |
-| Python in Plain English (Faker+DuckDB+sklearn article, 2025-09) | technique reference | churn pipeline pattern (no code clone) |
-
-Rejected: `dbt-labs/jaffle-shop-template` (no LICENSE + 2.5y unmaintained).
+Quickstart: `cd packages/data-analytics-demo && make install && (ollama serve &) && make demo`.
## Update protocol
@@ -45,9 +34,9 @@ When starting a session:
## What lives where
-| Information | Location | Lifetime |
-|---|---|---|
-| Ephemeral in-progress state | this file | days–weeks |
-| Decisions (why we chose X) | [docs/adr/](docs/adr/) | permanent |
-| Shipped feature log | [CHANGELOG.md](CHANGELOG.md) | permanent |
-| Conventions & rules for AI | [apps/*/AGENTS.md](apps/) | permanent |
+| Information | Location | Lifetime |
+| --------------------------- | ---------------------------- | ---------- |
+| Ephemeral in-progress state | this file | days–weeks |
+| Decisions (why we chose X) | [docs/adr/](docs/adr/) | permanent |
+| Shipped feature log | [CHANGELOG.md](CHANGELOG.md) | permanent |
+| Conventions & rules for AI | [apps/\*/AGENTS.md](apps/) | permanent |
diff --git a/packages/data-analytics-demo/README.md b/packages/data-analytics-demo/README.md
index d653138..c8f31cb 100644
--- a/packages/data-analytics-demo/README.md
+++ b/packages/data-analytics-demo/README.md
@@ -1,51 +1,77 @@
# @craftstack/data-analytics-demo
-> **Status**: Phase 0 scaffold (T-01 / T-02 complete). Pipeline stages T-03 onward are placeholders that exit 1 with a TODO message. See [ADR-0070](../../docs/adr/0070-data-analytics-demo-polyglot-adoption.md) for the design.
+Customer-analytics demo for a SaaS-style data set: synthetic data → SQL marts (dbt) → ML (churn + upsell) → narrative (local LLM via Ollama) → BI dashboard (self-built static HTML) → KPI semantic layer (MetricFlow). All six layers run on a developer laptop, with no credit card and no cloud-LLM API calls.
-Local-only SaaS customer-analytics demo: synthetic data → SQL marts (dbt) → ML (churn + upsell) → narrative (local LLM via Ollama) → BI dashboard (Evidence) → KPI semantic layer (MetricFlow).
+## Why it exists
-## Constraints (load-bearing — see ADR-0070)
+It is the portfolio answer to a data-analyst job description that explicitly names three axes:
-- **Zero credit card** — no Snowflake, BigQuery, Anthropic, OpenAI, or any paid service. Synthetic data only.
-- **Local LLM only** — narrative generation runs against a local Ollama server. No external network calls.
-- **Consumer laptop** — designed to complete `make demo` on a developer laptop in under 5 minutes.
-- **Synthetic data only** — no real customer PII. Faker + DuckDB tpcds generate everything.
+1. **Advanced SQL + statistical modelling** — SQL marts and propensity models for churn and upsell.
+2. **Business-strategy narratives** — an executive brief generated from the model's own SHAP feature importances.
+3. **BI enablement** — a single source of truth (MetricFlow KPI definitions) plus a static dashboard built from the same marts.
-## Quickstart
+A recruiter cloning this repo can run `make demo` and read all three deliverables in under five minutes.
+
+## Quickstart (5 commands)
```bash
-# 1. Install the package (editable, with dev extras)
-make install
+make install # editable install + dev extras
+ollama serve & # start local Ollama
+ollama pull llama3.1:8b-instruct-q4_K_M # or set OLLAMA_MODEL to a model already pulled
+make demo # data → dbt → ml → narrative → dashboard → semantic
+open dashboard/build/index.html # (or your platform equivalent)
+```
+
+`make demo` runs the full chain with a visible banner per stage. Any stage failure halts the chain with a non-zero exit code (AC-α.2).
+
+## Layout
+
+| Path | Role |
+| -------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
+| `pyproject.toml` | Python package definition + pinned deps (`duckdb >= 1.4.2` mitigates [CVE-2025-64429](https://github.com/duckdb/duckdb/security/advisories/GHSA-vmp8-hg63-v2hp)) |
+| `package.json` | pnpm workspace member (scripts proxy to `make`) |
+| `Makefile` | Single user-facing entry point — every stage has a target; `make demo` chains all six |
+| `src/data_analytics_demo/` | Python source ([data](src/data_analytics_demo/data), [ml](src/data_analytics_demo/ml), [narrative](src/data_analytics_demo/narrative), [dashboard](src/data_analytics_demo/dashboard), [semantic](src/data_analytics_demo/semantic)) |
+| `dbt_project/` | dbt project (staging / intermediate / marts; uses `dbt-duckdb`) |
+| `semantic/kpi.yml` | MetricFlow-compatible semantic models + KPI metrics (single source of truth) |
+| `warehouse/` | Generated DuckDB file (gitignored) |
+| `ml/artifacts/` | Generated model + SHAP outputs (gitignored) |
+| `narrative/output.md` | Generated LLM narrative (gitignored) |
+| `dashboard/build/` | Generated static HTML site (gitignored) |
+| `tests/` | pytest suite — one file per layer plus an end-to-end test |
+| `docs/architecture.md` | Pipeline diagram + per-layer details |
-# 2. Make sure Ollama is running locally and the model is pulled
-ollama serve &
-ollama pull llama3.1:8b-instruct-q4_K_M
+## Architecture (one-line summary per layer)
-# 3. Run the full pipeline
-make demo
+```
+data Faker + numpy synthesise 1000 customers / 50 000 events / 2000 subscriptions / 5000 invoices into DuckDB
+dbt staging (4 views) → intermediate (2 views) → marts (rfm_segments, churn_features, upsell_opportunities, cohort_retention)
+ml LogisticRegression baseline + XGBoost on churn (ROC-AUC ≥ 0.70) + LogisticRegression on upsell (lift @ top-10% ≥ 1.5×) + SHAP summary
+narrative Local Ollama (llama3.1:8b-instruct-q4_K_M by default; overridable via the OLLAMA_MODEL env var) generates an executive markdown brief from the SHAP summary
+dashboard Self-built Python generator (Jinja2 + Plotly via CDN) emits 4 static HTML pages from the marts
+semantic MetricFlow YAML — 3 semantic models, 4 KPI metrics; structural invariants enforced by the validator
```
-`make demo` chains: `data → dbt → ml → narrative → dashboard`. Any stage failure halts the pipeline with a non-zero exit code.
+See [docs/architecture.md](docs/architecture.md) for the pipeline diagram and per-layer details.
-## Layout
+## Constraints (load-bearing — see [ADR-0070](../../docs/adr/0070-data-analytics-demo-polyglot-adoption.md))
+
+- **Zero credit card.** No Snowflake / BigQuery free trial; no Anthropic / OpenAI / Gemini API.
+- **Local LLM only.** Narrative generation runs against a local Ollama server; the module asserts the absence of cloud-LLM credentials at invocation time (AC-4.3).
+- **Consumer laptop.** End-to-end completes well under five minutes at the default seed sizing.
+- **Synthetic data only.** No real PII anywhere; Faker (`company_email()` / `company()`) and numpy generate every value.
+
+## Engineered ML signals (so the models have something to learn)
+
+- **Churn**: customers without an active subscription get 4× lower event weight, and their timestamps are biased into the older half of the history window — `recent_to_lifetime_ratio` in `churn_features` correlates with the cancel label.
+- **Upsell**: `feature_use_premium` / `feature_use_advanced` event distributions skew higher for paid tiers — `premium_event_count` in `upsell_opportunities` correlates with the upgrade label.
-| Path | Role |
-| -------------------------- | --------------------------------------------------------------------------- |
-| `pyproject.toml` | Python package definition + pinned deps (DuckDB ≥ 1.4.2 for CVE-2025-64429) |
-| `package.json` | pnpm workspace member (script proxies to Makefile) |
-| `Makefile` | Single entry point — every stage has a target |
-| `src/data_analytics_demo/` | Python source (data gen, ML, narrative) |
-| `dbt_project/` | dbt project (staging / intermediate / marts) |
-| `dashboard/` | Evidence BI sub-project (static HTML build) |
-| `semantic/` | MetricFlow KPI definitions |
-| `warehouse/` | Generated DuckDB file lives here (gitignored) |
-| `ml/artifacts/` | Generated model + SHAP outputs (gitignored) |
-| `tests/` | pytest suite covering each layer |
+Both signals are observable through SQL alone (no leak from the data generator into the ML feature surface).
## Prior art (pattern extraction only, no clone)
-Six OSS projects supply the design pattern; everything is reimplemented from scratch in this package. License + maintenance verified 2026-05-17. See ADR-0070 for the full table including a rejected candidate.
+Six OSS projects supplied the design pattern; everything is reimplemented from scratch in this package. License + maintenance literal-verified 2026-05-17. See [ADR-0070](../../docs/adr/0070-data-analytics-demo-polyglot-adoption.md) for the full table including a rejected candidate and the 2026-05-18 dashboard pivot.
## License
-MIT — same as the craftstack monorepo.
+MIT — same as the rest of the craftstack monorepo.
diff --git a/packages/data-analytics-demo/docs/architecture.md b/packages/data-analytics-demo/docs/architecture.md
new file mode 100644
index 0000000..fc5eae5
--- /dev/null
+++ b/packages/data-analytics-demo/docs/architecture.md
@@ -0,0 +1,133 @@
+# Architecture — `@craftstack/data-analytics-demo`
+
+End-to-end pipeline view. The same DuckDB file (`warehouse/analytics.duckdb`) is the load-bearing artifact — every layer either writes to it or reads from it.
+
+## Pipeline
+
+```mermaid
+flowchart LR
+ subgraph generate["1. Data generation (Faker + numpy)"]
+ G[generate.py] -->|customers / subscriptions / events / invoices| DB[(analytics.duckdb)]
+ end
+
+ subgraph dbt_layer["2. dbt transformation"]
+ DB --> STG[staging views × 4]
+ STG --> INT[intermediate views × 2]
+ INT --> MARTS["marts × 4<br/>rfm_segments<br/>churn_features<br/>upsell_opportunities<br/>cohort_retention"]
+ MARTS --> DB
+ end
+
+ subgraph ml_layer["3. ML pipelines"]
+ DB -->|churn_features| CHURN[churn.py<br/>LogReg + XGBoost]
+ DB -->|upsell_opportunities| UPSELL[upsell.py<br/>LogReg propensity]
+ CHURN --> ART[ml/artifacts/<br/>model.pkl + shap_summary.json]
+ UPSELL --> ART
+ end
+
+ subgraph narrative_layer["4. Narrative (local LLM)"]
+ ART -->|shap_summary.json| NARR[ollama_client + prompts]
+ NARR --> OLL[(Ollama @ localhost:11434)]
+ OLL --> NARR
+ NARR --> NMD[narrative/output.md]
+ end
+
+ subgraph dashboard_layer["5. Dashboard (Python + Jinja2 + Plotly)"]
+ DB -->|marts| DASH[render.py + templates]
+ DASH --> HTML[dashboard/build/<br/>index / rfm / churn / kpi]
+ end
+
+ subgraph semantic_layer["6. Semantic layer (MetricFlow)"]
+ KPI[semantic/kpi.yml] --> VAL[validator.py]
+ VAL --> REP[ValidationReport]
+ end
+
+ classDef store fill:#fff4e1,stroke:#a16207;
+ classDef art fill:#e1f5ff,stroke:#0369a1;
+ classDef out fill:#e7ffe7,stroke:#16a34a;
+ class DB,OLL store;
+ class STG,INT,MARTS,ART art;
+ class NMD,HTML,REP out;
+```
+
+## Layer details
+
+### 1. Data generation — `src/data_analytics_demo/data/`
+
+| File | Role |
+| ------------- | --------------------------------------------------------------------------------------------------------------------------------------------- |
+| `schemas.py` | Pydantic models (`Customer`, `Subscription`, `Event`, `Invoice`) define the column shapes the dbt sources expect. |
+| `generate.py` | Faker (names / emails / companies) + numpy (deterministic distributions) → 4 tables in DuckDB. Seed lives in `DEMO_RANDOM_SEED` (default 42). |
+
+The generator deliberately injects two patterns so the ML layer has something to learn:
+
+- **Churn**: customers without active subscriptions get 4× lower event weight, and their timestamps are biased into the older half of the history window (so `recent_to_lifetime_ratio` carries signal).
+- **Upsell**: `feature_use_premium` / `feature_use_advanced` events skew higher for paid tiers.
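
The churn pattern described above can be sketched with the standard library alone. This is an illustrative sketch, not the contents of `generate.py`: `HISTORY_DAYS`, `RECENT_WINDOW_DAYS`, `event_days_ago`, `seed_from_env`, and the base event count are hypothetical, and the hard cut-off at the older half is a simplification of the "biased into the older half" behaviour. Only `DEMO_RANDOM_SEED` (default 42) and the 4× weight come from the text.

```python
import os
import random
from typing import Mapping

HISTORY_DAYS = 365        # hypothetical length of the history window
RECENT_WINDOW_DAYS = 90   # hypothetical definition of "recent"


def seed_from_env(env: Mapping[str, str] = os.environ) -> int:
    """Read DEMO_RANDOM_SEED, defaulting to 42."""
    return int(env.get("DEMO_RANDOM_SEED", "42"))


def event_days_ago(rng: random.Random, is_active: bool, base_events: int = 40) -> list[int]:
    """Return the age in days of each event for one synthetic customer."""
    # Customers without an active subscription get a quarter of the events (4x lower weight).
    n_events = base_events if is_active else base_events // 4
    # Simplification: their events land only in the older half of the window.
    newest = 0 if is_active else HISTORY_DAYS // 2
    return [rng.randint(newest, HISTORY_DAYS) for _ in range(n_events)]


def recent_to_lifetime_ratio(days_ago: list[int]) -> float:
    """Share of a customer's events that fall inside the recent window."""
    if not days_ago:
        return 0.0
    recent = sum(1 for d in days_ago if d <= RECENT_WINDOW_DAYS)
    return recent / len(days_ago)


rng = random.Random(seed_from_env({}))  # {} ignores the real environment in this demo
active = event_days_ago(rng, is_active=True)
churned = event_days_ago(rng, is_active=False)
print(len(active), recent_to_lifetime_ratio(active))
print(len(churned), recent_to_lifetime_ratio(churned))
```

Because the ratio is derived from event timestamps only, the same quantity can be computed in SQL inside the mart without the generator exposing any label.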
+
+### 2. dbt transformation — `dbt_project/`
+
+Standard staging → intermediate → marts layout, profiled to a local DuckDB. Marts are materialised as tables (the ML / dashboard layers read them); staging and intermediate stay as views.
+
+| Mart | Grain | Purpose |
+| ---------------------- | ----------------------------------------------- | ------------------------------------------------------ |
+| `rfm_segments` | one row per active customer | R / F / M quintile scoring + 5-bucket label |
+| `churn_features` | one row per customer | Feature table for churn prediction; `is_churned` label |
+| `upsell_opportunities` | one row per free/pro customer | Feature table for upsell propensity; `upgraded` label |
+| `cohort_retention` | one row per (cohort_month, months_since_signup) | Monthly retention grid |
+
+20 schema tests (not_null, unique, accepted_values) enforce the contract on the marts.
+
+### 3. ML — `src/data_analytics_demo/ml/`
+
+| File | Role |
+| ------------ | ------------------------------------------------------------------------------------------------------------------------------------ |
+| `_io.py` | Shared mart loader; raises clear errors when warehouse / mart is missing. |
+| `churn.py` | Trains a LogisticRegression baseline AND an XGBoost classifier on `churn_features`, picks the higher hold-out ROC-AUC (floor: 0.70). |
+| `upsell.py` | LogisticRegression propensity on `upsell_opportunities`; measures hold-out ROC-AUC and lift @ top-10% (floor: 1.5×). |
+| `explain.py` | SHAP wrapper. Tries TreeExplainer first; falls back to a model-agnostic explainer. |
+
+Determinism: every step that draws random numbers takes `random_state=42`. Re-running with the same seed produces byte-identical artifacts.
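
For readers unfamiliar with the lift @ top-10% metric used for the upsell floor, here is a minimal pure-Python sketch of one common definition: the positive rate among the top-scored 10% of rows divided by the overall positive rate. The function name and the toy data are hypothetical, and `upsell.py` may compute the metric differently.

```python
from typing import Sequence


def lift_at_top_fraction(
    y_true: Sequence[int], scores: Sequence[float], fraction: float = 0.10
) -> float:
    """Positive rate in the top-scored `fraction` of rows divided by the overall positive rate."""
    if len(y_true) != len(scores) or not y_true:
        raise ValueError("y_true and scores must be non-empty and the same length")
    base_rate = sum(y_true) / len(y_true)
    if base_rate == 0:
        raise ValueError("lift is undefined when there are no positives")
    # Rank rows by predicted score, highest first, and keep the top slice.
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    top_n = max(1, int(len(ranked) * fraction))
    top_rate = sum(label for _, label in ranked[:top_n]) / top_n
    return top_rate / base_rate


# Toy hold-out set: 20 rows, 4 positives, two of them ranked at the very top.
labels = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
scores = [0.95, 0.90, 0.80, 0.70, 0.65, 0.60, 0.55, 0.50, 0.45, 0.40,
          0.35, 0.30, 0.25, 0.20, 0.15, 0.10, 0.08, 0.06, 0.04, 0.02]
LIFT_FLOOR = 1.5  # the floor quoted for upsell.py
lift = lift_at_top_fraction(labels, scores)
print(f"lift@top-10% = {lift:.2f}, passes floor: {lift >= LIFT_FLOOR}")
```

A lift of 1.0 means the model ranks no better than random selection, which is why the floor sits above it.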
+
+### 4. Narrative — `src/data_analytics_demo/narrative/`
+
+| File | Role |
+| ------------------ | ----------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| `ollama_client.py` | Env-var-gated host + model resolution; runtime assertion that no cloud-LLM credentials are present (AC-4.3). |
+| `prompts.py` | Executive-brief prompt template. Rendering the SHAP summary into a prompt is its only call point. |
+| `generate.py` | Reads `shap_summary.json`, calls Ollama, wraps the body with provenance metadata (model id, source path, timestamp, and an "External LLM calls: 0" notice). |
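
The credential guard and the env-var resolution described for `ollama_client.py` might look roughly like the sketch below. The function names, the list of credential variables, and the use of `OLLAMA_HOST` are assumptions made for illustration; only `OLLAMA_MODEL`, the `localhost:11434` endpoint, and the model tag appear in this package's docs.

```python
import os
from typing import Mapping

# Assumed list of cloud-LLM credential variables; the real module may check others.
CLOUD_LLM_ENV_VARS = ("OPENAI_API_KEY", "ANTHROPIC_API_KEY", "GEMINI_API_KEY")


def assert_no_cloud_credentials(env: Mapping[str, str] = os.environ) -> None:
    """Raise if any cloud-LLM credential is set, so the run stays local-only."""
    present = [name for name in CLOUD_LLM_ENV_VARS if env.get(name)]
    if present:
        raise RuntimeError(f"cloud-LLM credentials present: {', '.join(present)}")


def resolve_ollama_target(env: Mapping[str, str] = os.environ) -> tuple[str, str]:
    """Return (host, model), falling back to local defaults when unset."""
    host = env.get("OLLAMA_HOST", "http://localhost:11434")  # assumed variable name
    model = env.get("OLLAMA_MODEL", "llama3.1:8b-instruct-q4_K_M")
    return host, model


# Demo with explicit mappings instead of the real environment.
assert_no_cloud_credentials({})  # passes silently
print(resolve_ollama_target({"OLLAMA_MODEL": "mistral"}))
```

Taking the environment as a parameter keeps the guard testable without mutating `os.environ`.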
+
+### 5. Dashboard — `src/data_analytics_demo/dashboard/`
+
+| File | Role |
+| --------------------- | ----------------------------------------------------------------------------------------------------------------- |
+| `render.py` | Reads marts via DuckDB, calls into `charts.py` and renders Jinja2 templates. |
+| `queries.py` | Centralised SQL queries against the marts. |
+| `charts.py` | Plotly figure builders (bar / scatter / line / area / heatmap). CDN-served plotly.js keeps per-page size ≤ 40 KB. |
+| `templates/*.html.j2` | Base layout + index / rfm / churn / kpi pages. |
+
+The original design used Evidence (MIT), but its SvelteKit-based build chain hit four or more chained peer-dependency failures under pnpm 10's isolated layout. The amendment in [ADR-0070](../../../docs/adr/0070-data-analytics-demo-polyglot-adoption.md) documents the pivot.
+
+### 6. Semantic layer — `src/data_analytics_demo/semantic/` + `semantic/kpi.yml`
+
+| Asset | Role |
+| ------------------ | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| `semantic/kpi.yml` | MetricFlow-compatible semantic models (customers / subscriptions / invoices) + metrics (customers, active_subscriptions, mrr, paid_invoice_volume). |
+| `validator.py` | Pure-Python validator. Enforces required keys, non-empty dims/measures, cross-references. Independent of the MetricFlow CLI so the test suite has no shell dependency. |
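
The three checks listed for the validator (required keys, non-empty dims/measures, cross-references) can be illustrated on an already-parsed dictionary. This is a simplified sketch: the `ValidationReport` fields, the `validate` signature, and the flat `measure` key on each metric are hypothetical, and the real `kpi.yml` follows MetricFlow's richer schema.

```python
from dataclasses import dataclass, field


@dataclass
class ValidationReport:
    """Collects human-readable problems; an empty list means the spec passed."""
    errors: list[str] = field(default_factory=list)

    @property
    def ok(self) -> bool:
        return not self.errors


def validate(spec: dict) -> ValidationReport:
    report = ValidationReport()
    known_measures: set[str] = set()
    for model in spec.get("semantic_models", []):
        name = model.get("name", "<unnamed>")
        # Required keys must exist and must not be empty.
        for key in ("name", "dimensions", "measures"):
            if not model.get(key):
                report.errors.append(f"semantic model {name}: missing or empty '{key}'")
        known_measures.update(
            m["name"] for m in (model.get("measures") or []) if "name" in m
        )
    # Cross-reference: every metric must point at a declared measure.
    for metric in spec.get("metrics", []):
        ref = metric.get("measure")
        if ref not in known_measures:
            report.errors.append(
                f"metric {metric.get('name', '<unnamed>')}: unknown measure {ref!r}"
            )
    return report


spec = {
    "semantic_models": [
        {"name": "subscriptions",
         "dimensions": [{"name": "plan"}],
         "measures": [{"name": "mrr_sum"}]},
    ],
    "metrics": [
        {"name": "mrr", "measure": "mrr_sum"},
        {"name": "broken", "measure": "does_not_exist"},
    ],
}
report = validate(spec)
print(report.ok, report.errors)
```

Working on plain dictionaries keeps the checks independent of any CLI, which matches the stated goal of a test suite with no shell dependency.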
+
+## Files produced by a full `make demo` run
+
+```
+warehouse/analytics.duckdb ← stages 1 + 2 (read by stage 5)
+ml/artifacts/churn_model.pkl ← stage 3 (churn)
+ml/artifacts/churn_metadata.json
+ml/artifacts/shap_summary.json
+ml/artifacts/upsell_model.pkl ← stage 3 (upsell)
+ml/artifacts/upsell_metadata.json
+ml/artifacts/upsell_lift_report.json
+narrative/output.md ← stage 4
+dashboard/build/index.html ← stage 5
+dashboard/build/rfm.html
+dashboard/build/churn.html
+dashboard/build/kpi.html
+```
+
+All output paths are gitignored — only the source code, dbt SQL, semantic YAML, templates, and tests are tracked.