feat(data-analytics-demo): T-06 churn + T-07 upsell ML pipelines by leagames0221-sys · Pull Request #87 · leagames0221-sys/craftstack

leagames0221-sys · 2026-05-17T14:02:42Z

Summary

Phase 3 of the data-analytics-demo bolt-on. Trains two propensity models on the dbt marts that #86 shipped, and saves the model + SHAP summary that the next phase (T-08 LLM narrative) will consume.

What lands

| Path | Role |
| --- | --- |
| `src/data_analytics_demo/ml/_io.py` | Shared mart loader; clear error when warehouse / mart is missing (AC-3.4) |
| `src/data_analytics_demo/ml/explain.py` | SHAP wrapper: tries TreeExplainer first, falls back to the model-agnostic Explainer; emits a `top_features` JSON the narrative layer can read directly |
| `src/data_analytics_demo/ml/churn.py` | LogisticRegression baseline + XGBoost; picks the higher hold-out ROC-AUC; saves model + metadata + SHAP summary |
| `src/data_analytics_demo/ml/upsell.py` | LogisticRegression propensity; measures ROC-AUC and lift @ top-10% / top-20%; raises if lift falls below the 1.5× floor |
| `tests/test_ml_churn.py` (5 cases) + `tests/test_ml_upsell.py` (2 cases) | AC-3.1 to AC-3.7 verification |
| Makefile `ml` + `cli.py ml` | Wired up to invoke both pipelines and print headline metrics |
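
The "picks the higher hold-out ROC-AUC" step can be sketched without any ML dependency. This is an illustration only, not the code in `churn.py`: `roc_auc` and `pick_best` are hypothetical helper names, and the candidate scores are assumed to be hold-out probabilities that the two fitted models have already produced.

```python
from typing import Dict, Sequence, Tuple


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """ROC-AUC as the chance that a positive outranks a negative (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def pick_best(
    y_holdout: Sequence[int],
    candidate_scores: Dict[str, Sequence[float]],
) -> Tuple[str, Dict[str, float]]:
    """Score every candidate on the same hold-out set and return the winner.

    On an exact tie, the candidate listed first in the dict wins.
    """
    aucs = {name: roc_auc(y_holdout, s) for name, s in candidate_scores.items()}
    best = max(aucs, key=lambda name: aucs[name])
    return best, aucs
```

Returning the full `aucs` mapping alongside the winner keeps the losing model's score available for the metadata file, which is how a line like "LR wins; XGBoost 0.7196" can be reported.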

AC coverage (seed=42, n_customers=1000)

| AC | Threshold | Actual | Result |
| --- | --- | --- | --- |
| 3.1 | artifacts written | model.pkl + metadata.json + shap_summary.json | PASS |
| 3.2 | churn ROC-AUC ≥ 0.70 | 0.7448 (LR) | PASS |
| 3.3 | SHAP summary persists | top-10 features w/ direction | PASS |
| 3.4 | clear error on missing data | `FileNotFoundError` w/ remediation hint | PASS |
| 3.5 | deterministic | metrics + predictions identical across runs | PASS |
| 3.6 | upsell artifacts written | upsell_model.pkl + metadata + lift_report | PASS |
| 3.7 | lift @ top-10% ≥ 1.5× | 2.81× | PASS |
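
For readers unfamiliar with the AC-3.7 metric, lift @ top-k is the positive rate among the top-scored fraction divided by the overall positive rate. The sketch below shows one common way to compute it and to enforce a floor; `lift_at_top` and `check_lift_floor` are hypothetical names, and the rounding rule (ceiling, at least one row) is an assumption, not necessarily what `upsell.py` does.

```python
import math
from typing import Sequence

LIFT_FLOOR = 1.5  # the AC-3.7 threshold quoted in the table above


def lift_at_top(y_true: Sequence[int], scores: Sequence[float], fraction: float) -> float:
    """Positive rate in the top `fraction` of scores, relative to the base rate."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    n = len(y_true)
    if n == 0:
        raise ValueError("need at least one row")
    base_rate = sum(y_true) / n
    if base_rate == 0:
        raise ValueError("lift is undefined when there are no positives")
    k = max(1, math.ceil(n * fraction))  # assumption: round up, never below 1 row
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    top_rate = sum(label for _, label in ranked[:k]) / k
    return top_rate / base_rate


def check_lift_floor(lift: float, floor: float = LIFT_FLOOR) -> None:
    """Raise when the measured lift falls below the acceptance floor."""
    if lift < floor:
        raise ValueError(f"lift {lift:.2f}x is below the {floor:.1f}x floor")
```

A lift of 2.81× at top-10% therefore means the highest-scored decile contains upsell positives at roughly 2.8 times the population rate.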

Generator amendment

The original churn signal in `data/generate.py` was under-engineered: the best ROC-AUC was 0.6972, narrowly below the 0.70 floor. Reworked `_generate_events`:

- Active customers: 4× event weight; timestamps uniform across the 2-year window.
- Churned customers: 1× event weight; timestamps biased into the older half (60..730 days).

The `recent_to_lifetime_ratio` feature in `churn_features` now carries a real signal, lifting churn ROC-AUC to 0.7448.
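
To make the mechanism concrete, here is a minimal sketch of how such a generator could separate the two groups. It is not the code in `_generate_events`: the function names, the reading of "event weight" as an event-count multiplier, the `base_events` parameter, and the 60-day "recent" window are all assumptions chosen for illustration.

```python
import random
from typing import List

HISTORY_DAYS = 730        # the 2-year window from the description
RECENT_WINDOW_DAYS = 60   # assumption: what the mart treats as "recent"


def sample_event_ages(rng: random.Random, churned: bool, base_events: int = 10) -> List[int]:
    """Return event ages in days before the snapshot date."""
    if churned:
        # 1x weight, and no events newer than 60 days.
        return [rng.randint(60, HISTORY_DAYS) for _ in range(base_events)]
    # 4x weight, spread uniformly over the whole window.
    return [rng.randint(0, HISTORY_DAYS) for _ in range(4 * base_events)]


def recent_to_lifetime_ratio(ages: List[int], recent_window_days: int = RECENT_WINDOW_DAYS) -> float:
    """Share of a customer's events that fall inside the recent window."""
    if not ages:
        return 0.0
    recent = sum(1 for age in ages if age < recent_window_days)
    return recent / len(ages)
```

Under these assumptions a churned customer's ratio is exactly zero while an active customer's ratio hovers around 60/731, which is the kind of separation a linear model can pick up. Passing a seeded `random.Random` instance keeps the output reproducible, in line with AC-3.5.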

Test-infra notes

- Switched the fixture from `subprocess.run(["dbt", ...])` to `dbt.cli.main.dbtRunner` so the test works on Windows without the venv `Scripts` dir on PATH.
- The shared mart loader opens DuckDB in default (rw) mode rather than `read_only=True` so it can coexist with dbt's in-process adapter; DuckDB refuses to open the same file within one process under mismatched configurations.
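
Both notes can be illustrated together. The sketch below assumes dbt-core 1.5 or later (where `dbtRunner` is available) and the `duckdb` Python package; `dbt_args`, `run_dbt`, and `load_mart` are hypothetical names, the profiles directory location is an assumption, and the wording of the remediation hint is invented for the example.

```python
from pathlib import Path
from typing import Any, List, Tuple


def dbt_args(command: str, project_dir: Path) -> List[str]:
    """Build the CLI argument list; assumes profiles.yml sits in the project dir."""
    return [command, "--project-dir", str(project_dir), "--profiles-dir", str(project_dir)]


def run_dbt(command: str, project_dir: Path) -> None:
    """Invoke dbt in-process, so no `dbt` executable has to be on PATH."""
    from dbt.cli.main import dbtRunner  # imported lazily; requires dbt-core >= 1.5

    result = dbtRunner().invoke(dbt_args(command, project_dir))
    if not result.success:
        raise RuntimeError(f"dbt {command} failed: {result.exception}")


def load_mart(warehouse: Path, mart: str) -> List[Tuple[Any, ...]]:
    """Read every row of a mart, failing early with a hint when the file is absent."""
    if not warehouse.exists():
        raise FileNotFoundError(
            f"Warehouse {warehouse} not found; run `make data` and `make dbt` first."
        )
    import duckdb  # imported lazily so the missing-file check needs no dependency

    # Default read-write mode, matching the configuration dbt's adapter uses
    # when both open the same file inside one process.
    con = duckdb.connect(str(warehouse))
    try:
        return con.execute(f'SELECT * FROM "{mart}"').fetchall()
    finally:
        con.close()
```

Checking for the file before connecting matters because `duckdb.connect` in read-write mode would otherwise create an empty database and defer the failure to a less helpful "table not found" error.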

Local verify

- `make data` + `make dbt` + `make ml`: end-to-end OK
- `ruff` + `mypy --strict`: OK on 10 source files
- `pytest`: 15 passed, coverage 86.75% (≥ 80% floor)
- `check-doc-drift.mjs`: 0 failure / 0 warning
- `check-adr-claims.mjs`: 77/77 PASS

Test plan

- Required CI checks green (existing 11 + python-test + python-audit)
- No HIVE-token leaks (D-HIVE-OPACITY)

Phase 3 of the data-analytics-demo bolt-on. Trains two propensity models on the dbt marts shipped in #86 and saves the resulting model artifacts + SHAP summary that the narrative layer (T-08) consumes next.

T-06: Churn pipeline (AC-3.1 to AC-3.5):
- ml/churn.py: fits a LogisticRegression baseline AND an XGBoost classifier on `churn_features`, picks the higher hold-out ROC-AUC, and saves model.pkl + metadata.json + shap_summary.json.
- ml/explain.py: SHAP wrapper used by both the churn and (later) narrative paths. TreeExplainer first, falls back to model-agnostic.
- ml/_io.py: shared mart loader, fails with clear errors when the warehouse / mart is missing (AC-3.4).

T-07: Upsell propensity (AC-3.6 to AC-3.7):
- ml/upsell.py: fits a LogisticRegression propensity model on `upsell_opportunities`, measures hold-out ROC-AUC and lift @ top-10%, raises if the lift falls below the 1.5× floor.

Data-generator amendment: the churn signal in `data/generate.py` was under-engineered (best ROC-AUC was 0.6972, just below the AC-3.2 0.70 floor). Reworked the event generator so churned customers (a) get 4× lower event weight and (b) have their timestamps biased into the older half of the history window. The mart's `recent_to_lifetime_ratio` feature now correlates cleanly with the cancel label, pushing churn ROC-AUC to 0.7448 on a seed=42 / n_customers=1000 run.

Local verify (Python 3.12 venv, deterministic seed=42):
- `make data` + `make dbt` + `make ml` end-to-end OK
- Churn ROC-AUC = 0.7448 (LR wins; XGBoost 0.7196), AC-3.2 PASS
- Upsell lift @ top-10% = 2.81× (vs 1.5× floor), AC-3.7 PASS
- ruff OK / mypy OK / pytest 15 PASS, coverage 86.75%
- doc-drift 0 fail / adr-claims 77/77

Test infra: switched from `subprocess.run(["dbt", ...])` to `dbt.cli.main.dbtRunner` so the fixtures work on Windows without venv Scripts being on PATH. DuckDB rw-mode for both dbt + ml avoids the "different configuration" connection error when both run in-process.

vercel · 2026-05-17T14:02:47Z

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| craftstack-collab | Ready | Preview, Comment | May 17, 2026 2:04pm |
| craftstack-knowledge | Ready | Preview, Comment | May 17, 2026 2:04pm |

…#88) Phase 4 of the data-analytics-demo bolt-on. Reads the SHAP summary that the ML layer (#87) writes, sends a templated prompt to a local Ollama daemon, and saves an executive-facing markdown narrative, never touching a cloud LLM API.

What lands:
- narrative/ollama_client.py: env-var-gated host/model resolution (defaults: localhost:11434 + llama3.1:8b-instruct-q4_K_M), AC-4.3 assertion that no cloud-LLM credentials are present at invocation, AC-4.2 remediation hint when Ollama is unreachable.
- narrative/prompts.py: the executive-brief prompt template; SHAP-summary rendering is the only call point.
- narrative/generate.py: orchestration. Reads shap_summary.json, builds the prompt, calls Ollama, wraps the body with provenance metadata (model id, SHAP source path, timestamp, "external calls: 0" assertion-enforced advertisement); satisfies AC-4.1, AC-4.4, AC-4.5.
- tests/test_narrative.py: 7 cases covering AC-4.1 to AC-4.5 plus missing-data and prompt-builder paths. Uses monkeypatch to stub `ollama.Client` so no network is required in CI.
- Makefile narrative target + cli.py narrative subcommand wire-up.

AC coverage (mock-Ollama tests + real-Ollama smoke locally):
- AC-4.1 produces output.md PASS
- AC-4.2 unreachable Ollama PASS (clear RuntimeError with "ollama serve" hint)
- AC-4.3 external API guard PASS (raises before any client call)
- AC-4.4 cites shap_summary.json PASS
- AC-4.5 model identifier in output PASS

Local verify:
- ruff OK / mypy OK (14 source files) / pytest 22 PASS / coverage 87.20%
- Real smoke vs Ollama (gemma3:4b, env-var override): 3-paragraph executive narrative produced end-to-end, all metadata fields present.

Design note: the literal AC-4.5 default model name is preserved as the package default; deployments running a different quantized variant can override via the OLLAMA_MODEL env var without code changes.

Co-authored-by: leagames0221-sys <leagames0221@users.noreply.github.com>
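
The env-var-gated resolution and the AC-4.3 guard described in this commit could look roughly like the sketch below. Only the two defaults and the `OLLAMA_MODEL` variable come from the commit message; `OLLAMA_HOST`, the list of cloud credential variables, and both function names are assumptions made for illustration.

```python
from typing import List, Mapping, Tuple

DEFAULT_HOST = "localhost:11434"
DEFAULT_MODEL = "llama3.1:8b-instruct-q4_K_M"

# Assumption: which variables count as cloud-LLM credentials is not stated
# in the commit message; these two are placeholders.
CLOUD_CREDENTIAL_VARS = ("OPENAI_API_KEY", "ANTHROPIC_API_KEY")


def resolve_ollama_config(env: Mapping[str, str]) -> Tuple[str, str]:
    """Return (host, model), letting environment variables override the defaults."""
    host = env.get("OLLAMA_HOST") or DEFAULT_HOST  # OLLAMA_HOST is an assumed name
    model = env.get("OLLAMA_MODEL") or DEFAULT_MODEL
    return host, model


def assert_no_cloud_credentials(env: Mapping[str, str]) -> None:
    """Raise before any client call if a cloud-LLM credential is set."""
    present: List[str] = [name for name in CLOUD_CREDENTIAL_VARS if env.get(name)]
    if present:
        raise RuntimeError(
            "Cloud-LLM credentials present: " + ", ".join(present)
            + "; unset them before generating the narrative."
        )
```

Taking the environment as a `Mapping` argument instead of reading `os.environ` directly makes both functions trivial to test with a plain dict; production code would pass `os.environ`.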

vercel Bot deployed to Preview – craftstack-knowledge May 17, 2026 14:03 View deployment

vercel Bot deployed to Preview – craftstack-collab May 17, 2026 14:04 View deployment

leagames0221-sys merged commit 2957a1e into main May 17, 2026
12 checks passed

leagames0221-sys deleted the feat/data-analytics-demo-t06-t07-ml branch May 17, 2026 14:15

leagames0221-sys mentioned this pull request May 17, 2026

feat(data-analytics-demo): T-13 docs + T-14 changelog/handoff (Stage 4 close-out) #92

Merged

2 tasks
