leagames0221-sys · May 17, 2026
@@ -1,8 +1,8 @@
 # ADR-0070: Adopt polyglot (Python + TypeScript) for `packages/data-analytics-demo` — local-only SaaS customer-analytics demo
 
-- Status: Accepted
-- Date: 2026-05-17
-- Tags: architecture, polyglot, data-analytics, dbt, evidence, ollama, security, supply-chain
+- Status: Accepted (amended 2026-05-18 — dashboard pivoted from Evidence to a self-built Python+Jinja2+Plotly generator; see "2026-05-18 amendment" section below)
+- Date: 2026-05-17 (original), 2026-05-18 (amendment)
+- Tags: architecture, polyglot, data-analytics, dbt, ollama, security, supply-chain
 - Companions: [ADR-0001](0001-monorepo-turborepo-pnpm.md) (the monorepo layout this ADR extends with a polyglot package)
 
 ## Context
@@ -104,3 +104,34 @@ Rejected: **dbt-labs/jaffle-shop-template** — no LICENSE file in default branc
 - **TypeScript-only**: rejected — see Tradeoff 1. Cannot meet the quality bar with current TS data-analytics tooling.
 - **Separate repo (polyrepo)**: rejected — contradicts the monorepo decision in ADR-0001 and forfeits the "complex portfolio operated as a single deliverable" interview signal.
 - **Defer the demo**: rejected — the contract brief is live; deferring loses the matching window.
+
+## 2026-05-18 amendment — dashboard pivot
+
+The original Tradeoff 4 chose Evidence as the dashboard generator. Evidence is a high-quality OSS tool (MIT, evidence-dev/evidence, 6k+ stars) and the rationale stands on paper, but the integration cost in this monorepo turned out to be unbounded:
+
+- Evidence ships a SvelteKit-based build (`evidence build`) whose generated template code must resolve `@sveltejs/kit`, `vite`, `@evidence-dev/tailwind`, and several other transitive peers from its own flat `node_modules`.
+- pnpm 10's isolated layout and strict build-script approval gate broke that resolution in three different ways on consumer Windows; each fix surfaced the next missing peer, and the chain reached four or more peer-dep resolution failures locally before the pivot.
+- The dashboard sits at the seam between the Python pipeline (data + dbt + ML + narrative) and the static HTML output. Adopting Evidence meant adopting a second package manager (pnpm or npm) inside an otherwise-Python sub-tree, with its own audit + Dependabot + CI surface.
+
+**Decision**: replace Evidence with a self-built Python+Jinja2+Plotly generator that lives entirely inside `src/data_analytics_demo/dashboard/`. The change adds two PyPI deps (jinja2, BSD; plotly, MIT; both well-known and already on the audit allowlist) and ships ~150 lines of code that read the same dbt marts and write static HTML to `dashboard/build/`.
+
+### Why this is the better fit
+
+- **Smaller blast radius**: 2 PyPI deps instead of 629 npm deps with the associated peer-dep tangle. pip-audit covers the surface.
+- **Single toolchain**: the dashboard now runs through the same Python venv, ruff, mypy, pytest gates as the rest of the package; no second package manager, no separate workflow.
+- **Stronger portfolio signal**: "self-built static dashboard generator from synthetic SaaS marts" reads as analytics-engineering breadth; "I configured Evidence" reads as tool adoption.
+- **Full layout control**: Plotly figures + Jinja2 templates give the demo the same chart types Evidence was going to produce (bar / scatter / line / area / heatmap / data table) without the SvelteKit indirection.
+
+### Tradeoff 4 (revised)
+
+| Option                                    | Status               | Why                                                                    |
+| ----------------------------------------- | -------------------- | ---------------------------------------------------------------------- |
+| **Python + Jinja2 + Plotly (self-built)** | adopted              | Single toolchain, 2 PyPI deps, full control, audit-clean               |
+| Evidence                                  | rejected             | Peer-dep chain unbounded in this monorepo; second toolchain added cost |
+| Streamlit                                 | rejected (unchanged) | Requires a Python server at view time; no static export                |
+| Quarto                                    | rejected (unchanged) | BI focus weaker than the alternatives; CLI install required            |
+| Apache Superset                           | rejected (unchanged) | Full server with significant install overhead                          |
+
+### What the rest of this ADR still gets right
+
+Tradeoffs 1 (polyglot), 2 (DuckDB + Faker synthetic data), 3 (dbt), 5 (Ollama), and 6 (MetricFlow) are unchanged. The security mitigations (DuckDB ≥ 1.4.2 pin, pip-audit, Dependabot) and the polyglot CI structure carry over.
@@ -36,8 +36,7 @@ narrative:
 	$(PYTHON) -m data_analytics_demo.narrative.generate
 
 dashboard:
-	@echo "[dashboard] TODO T-09: Evidence dashboard not yet implemented"
-	@exit 1
+	$(PYTHON) -m data_analytics_demo.dashboard.render
 
 semantic-validate:
 	@echo "[semantic-validate] TODO T-10: MetricFlow validation not yet implemented"

@@ -28,6 +28,9 @@ dependencies = [
     # CLI + data validation
     "typer>=0.14",
     "pydantic>=2.9",
+    # Dashboard (self-built static HTML; replaces Evidence — see ADR-0070 amendment).
+    "jinja2>=3.1",
+    "plotly>=5.24",
 ]
 
 [project.optional-dependencies]
@@ -74,7 +77,7 @@ mypy_path = "src"
 # but lags behind `pandas` releases; treating these as untyped is the
 # pragmatic choice for a Python 3.11 + pandas 3.x stack.
 [[tool.mypy.overrides]]
-module = ["pandas", "pandas.*", "duckdb", "faker", "shap", "xgboost", "sklearn.*"]
+module = ["pandas", "pandas.*", "duckdb", "faker", "shap", "xgboost", "sklearn.*", "plotly", "plotly.*"]
 ignore_missing_imports = true
 
 [tool.pytest.ini_options]

@@ -45,6 +45,15 @@ def ml() -> None:
     )
 
 
+@app.command()
+def dashboard() -> None:
+    """Render the static HTML dashboard into dashboard/build/."""
+    from data_analytics_demo.dashboard import render as dashboard_render
+
+    out = dashboard_render.main()
+    typer.echo(f"wrote dashboard pages to {out}")
+
+
 @app.command()
 def narrative() -> None:
     """Generate an executive narrative from SHAP via local Ollama."""

@@ -0,0 +1,9 @@
+"""Self-built static-HTML dashboard generator (replaces Evidence per the ADR-0070 amendment).
+
+Reads marts from `warehouse/analytics.duckdb`, builds Plotly figures, and
+renders Jinja2 templates into `dashboard/build/{index,rfm,churn,kpi}.html`.
+
+Pure Python — no npm, no SvelteKit, no peer-dep chains. Build is
+single-process and reproducible via the same seed that feeds the data
+generator.
+"""
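
A minimal sketch of the render step this docstring describes, assuming a single in-memory Jinja2 template and a pre-built chart fragment. `render_page`, `PAGE_TEMPLATE`, and the sample values are hypothetical; the PR's actual `render` module and templates are not shown in this part of the diff. The fragment is wrapped in `Markup` so that autoescaping leaves the chart `<div>` intact while still escaping plain text such as the title.

```python
import tempfile
from pathlib import Path

from jinja2 import DictLoader, Environment, select_autoescape
from markupsafe import Markup

# Hypothetical template, kept inline so the sketch is self-contained.
PAGE_TEMPLATE = """<!doctype html>
<html>
<head><title>{{ title }}</title></head>
<body>
<h1>{{ title }}</h1>
<p>Customers: {{ metrics.customers }} | Churn: {{ metrics.churn_rate }}%</p>
{{ chart }}
</body>
</html>
"""


def render_page(
    out_dir: Path, title: str, metrics: dict[str, float], chart_div: str
) -> Path:
    """Render one static page and return the path it was written to."""
    env = Environment(
        loader=DictLoader({"page.html": PAGE_TEMPLATE}),
        autoescape=select_autoescape(["html"]),
    )
    html = env.get_template("page.html").render(
        title=title,
        metrics=metrics,
        # Trusted, generator-produced HTML: mark it safe so it is not escaped.
        chart=Markup(chart_div),
    )
    out_dir.mkdir(parents=True, exist_ok=True)
    out_path = out_dir / "index.html"
    out_path.write_text(html, encoding="utf-8")
    return out_path


# Usage with made-up values; the chart string stands in for a Plotly fragment.
with tempfile.TemporaryDirectory() as tmp:
    page = render_page(
        Path(tmp) / "build",
        "KPIs & churn",
        {"customers": 500, "churn_rate": 12.5},
        '<div id="chart">plot</div>',
    )
    page_name = page.name
    html_text = page.read_text(encoding="utf-8")
```

Only strings produced by the generator itself should be passed through `Markup`; anything read from the warehouse is better left to autoescaping.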
@@ -0,0 +1,97 @@
+"""Plotly figure builders. Each function returns an HTML string ready to embed."""
+
+from __future__ import annotations
+
+from typing import TYPE_CHECKING
+
+import plotly.express as px
+
+if TYPE_CHECKING:
+    import pandas as pd
+
+# CDN mode keeps each page small (~10 KB instead of ~4 MB with plotly.js inlined);
+# the trade-off is that viewing a page needs network access to fetch plotly.js.
+PLOTLY_JS_MODE = "cdn"
+
+
+def _to_div(fig: object) -> str:
+    """Render a plotly figure as a div fragment (no <html><body>)."""
+    html: str = fig.to_html(  # type: ignore[attr-defined]
+        include_plotlyjs=PLOTLY_JS_MODE,
+        full_html=False,
+        config={"displaylogo": False},
+    )
+    return html
+
+
+def rfm_bar(df: pd.DataFrame) -> str:
+    fig = px.bar(
+        df,
+        x="rfm_segment",
+        y="customers",
+        text="customers",
+        title="Customers per RFM segment",
+    )
+    fig.update_layout(xaxis_title="Segment", yaxis_title="Customers", height=400)
+    return _to_div(fig)
+
+
+def rfm_scatter(df: pd.DataFrame) -> str:
+    fig = px.scatter(
+        df,
+        x="recency_days",
+        y="frequency_events",
+        color="rfm_segment",
+        size="monetary_usd",
+        hover_data=["customer_id"],
+        title="Recency × Frequency (size = monetary)",
+    )
+    fig.update_layout(
+        xaxis_title="Recency (days; lower is better)",
+        yaxis_title="Frequency (event count)",
+        height=520,
+    )
+    return _to_div(fig)
+
+
+def churn_by_tier_bar(df: pd.DataFrame) -> str:
+    fig = px.bar(
+        df,
+        x="current_plan_tier",
+        y="churn_pct",
+        text="churn_pct",
+        title="Churn rate by plan tier",
+    )
+    fig.update_layout(xaxis_title="Plan tier", yaxis_title="Churn %", height=400)
+    return _to_div(fig)
+
+
+def signups_line(df: pd.DataFrame) -> str:
+    fig = px.line(df, x="month", y="signups", title="Monthly signups")
+    fig.update_layout(xaxis_title="Month", yaxis_title="New customers", height=400)
+    return _to_div(fig)
+
+
+def paid_invoice_area(df: pd.DataFrame) -> str:
+    fig = px.area(
+        df,
+        x="month",
+        y="paid_amount_usd",
+        title="Paid invoice volume per month (USD)",
+    )
+    fig.update_layout(xaxis_title="Month", yaxis_title="USD", height=400)
+    return _to_div(fig)
+
+
+def cohort_heatmap(df: pd.DataFrame) -> str:
+    pivot = df.pivot_table(
+        index="cohort_month", columns="months_since_signup", values="retention_pct"
+    )
+    fig = px.imshow(
+        pivot,
+        labels={"x": "Months since signup", "y": "Cohort month", "color": "Retention %"},
+        title="Cohort retention heatmap",
+        color_continuous_scale="Blues",
+        aspect="auto",
+    )
+    fig.update_layout(height=480)
+    return _to_div(fig)
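
To illustrate the reshape inside `cohort_heatmap`: the query layer returns one row per (cohort, month) pair, and `pivot_table` turns that long format into the cohort-by-month grid that `px.imshow` draws. The sketch below uses invented sample values, not data from the PR.

```python
import pandas as pd

# Invented sample rows in the long format the cohort query returns.
long_df = pd.DataFrame(
    {
        "cohort_month": ["2025-01", "2025-01", "2025-01", "2025-02", "2025-02"],
        "months_since_signup": [0, 1, 2, 0, 1],
        "retention_pct": [100.0, 80.0, 65.0, 100.0, 75.0],
    }
)

# Same call shape as in cohort_heatmap: rows become cohorts, columns become
# months since signup, and cells hold the retention percentage.
wide = long_df.pivot_table(
    index="cohort_month", columns="months_since_signup", values="retention_pct"
)

print(wide)
```

The younger cohort has no month-2 observation yet, so that cell is NaN in the grid. Because `pivot_table` averages duplicate keys by default, the grid mirrors the mart exactly only when each (cohort, month) pair appears once.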
@@ -0,0 +1,126 @@
+"""SQL queries against the dbt marts.
+
+Each function takes an open DuckDB connection and returns a DataFrame.
+Centralising the SQL here keeps the templates focused on layout.
+"""
+
+from __future__ import annotations
+
+from typing import TYPE_CHECKING
+
+if TYPE_CHECKING:
+    import duckdb
+    import pandas as pd
+
+
+def _scalar(con: duckdb.DuckDBPyConnection, sql: str) -> float:
+    """Run a single-cell aggregate query and return the value (0.0 if empty or NULL)."""
+    row = con.execute(sql).fetchone()
+    # An aggregate over zero rows can yield one row holding NULL, so guard both cases.
+    if row is None or row[0] is None:
+        return 0.0
+    return float(row[0])
+
+
+def headline_metrics(con: duckdb.DuckDBPyConnection) -> dict[str, float]:
+    """Top-of-page numbers — customers, active rate, churn rate."""
+    n_customers = _scalar(con, "select count(*) from customers")
+    active_rate = _scalar(
+        con,
+        "select coalesce(avg(case when status='active' then 1.0 else 0.0 end)*100, 0) "
+        "from subscriptions",
+    )
+    churn_rate = _scalar(
+        con, "select coalesce(avg(is_churned)*100, 0) from churn_features"
+    )
+    return {
+        "customers": int(n_customers),
+        "active_rate": round(active_rate, 1),
+        "churn_rate": round(churn_rate, 1),
+    }
+
+
+def rfm_distribution(con: duckdb.DuckDBPyConnection) -> pd.DataFrame:
+    return con.execute(
+        """
+        select rfm_segment, count(*) as customers, round(avg(monetary_usd), 0) as avg_monetary
+        from rfm_segments
+        group by rfm_segment
+        order by customers desc
+        """
+    ).fetchdf()
+
+
+def rfm_scatter(con: duckdb.DuckDBPyConnection) -> pd.DataFrame:
+    return con.execute(
+        """
+        select customer_id, recency_days, frequency_events, monetary_usd, rfm_segment
+        from rfm_segments
+        """
+    ).fetchdf()
+
+
+def churn_by_tier(con: duckdb.DuckDBPyConnection) -> pd.DataFrame:
+    return con.execute(
+        """
+        select
+          current_plan_tier,
+          count(*) as customers,
+          round(avg(is_churned)*100, 1) as churn_pct,
+          round(avg(events_last_30d), 1) as avg_events_30d
+        from churn_features
+        group by current_plan_tier
+        order by churn_pct desc
+        """
+    ).fetchdf()
+
+
+def churn_activity_buckets(con: duckdb.DuckDBPyConnection) -> pd.DataFrame:
+    return con.execute(
+        """
+        select
+          case
+            when recent_to_lifetime_ratio is null then 'no activity'
+            when recent_to_lifetime_ratio < 0.3 then '0.0 – 0.3 (slowing)'
+            when recent_to_lifetime_ratio < 0.7 then '0.3 – 0.7'
+            when recent_to_lifetime_ratio < 1.5 then '0.7 – 1.5 (steady)'
+            else '1.5+ (accelerating)'
+          end as activity_bucket,
+          count(*) as customers,
+          round(avg(is_churned)*100, 1) as churn_pct
+        from churn_features
+        group by activity_bucket
+        order by churn_pct desc
+        """
+    ).fetchdf()
+
+
+def monthly_signups(con: duckdb.DuckDBPyConnection) -> pd.DataFrame:
+    return con.execute(
+        """
+        select date_trunc('month', signup_date) as month, count(*) as signups
+        from customers
+        group by 1 order by 1
+        """
+    ).fetchdf()
+
+
+def monthly_paid_invoice_volume(con: duckdb.DuckDBPyConnection) -> pd.DataFrame:
+    return con.execute(
+        """
+        select date_trunc('month', period_start) as month,
+               sum(amount_usd) as paid_amount_usd
+        from invoices
+        where status = 'paid'
+        group by 1 order by 1
+        """
+    ).fetchdf()
+
+
+def cohort_retention_grid(con: duckdb.DuckDBPyConnection) -> pd.DataFrame:
+    return con.execute(
+        """
+        select cohort_month, months_since_signup, retention_pct
+        from cohort_retention
+        order by cohort_month, months_since_signup
+        """
+    ).fetchdf()
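
The `CASE` expression in `churn_activity_buckets` above assigns each customer to the first branch that matches, so every boundary value falls into the higher bucket. A plain-Python mirror of that logic, written here only to make the boundaries explicit (`activity_bucket` is a hypothetical helper, not part of the PR):

```python
from __future__ import annotations


def activity_bucket(ratio: float | None) -> str:
    """Mirror the SQL CASE: branches are tested in order, first match wins."""
    if ratio is None:
        return "no activity"
    if ratio < 0.3:
        return "0.0 – 0.3 (slowing)"
    if ratio < 0.7:
        return "0.3 – 0.7"
    if ratio < 1.5:
        return "0.7 – 1.5 (steady)"
    return "1.5+ (accelerating)"


# Boundary values land in the upper bucket because each test is a strict "<".
for value in (None, 0.0, 0.3, 0.7, 1.5):
    print(value, "->", activity_bucket(value))
```

A ratio of exactly 0.3 is therefore reported as "0.3 – 0.7", and exactly 1.5 as accelerating.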