diff --git a/docs/adr/0070-data-analytics-demo-polyglot-adoption.md b/docs/adr/0070-data-analytics-demo-polyglot-adoption.md
index 74da07d..3df424f 100644
--- a/docs/adr/0070-data-analytics-demo-polyglot-adoption.md
+++ b/docs/adr/0070-data-analytics-demo-polyglot-adoption.md
@@ -59,7 +59,7 @@ Cube.js is JS-only, which would push the semantic layer back into the TS side an
 ## Security mitigations
 
 - **DuckDB ≥ 1.4.2** pin in `pyproject.toml`. Closes [GHSA-vmp8-hg63-v2hp / CVE-2025-64429](https://github.com/duckdb/duckdb/security/advisories/GHSA-vmp8-hg63-v2hp) (encryption crypto, medium severity, all `>= 1.4.0` affected, patched in 1.4.2). The CSV-sniff bypass ([GHSA-w2gf-jxc9-pf2q / CVE-2024-41672](https://github.com/duckdb/duckdb/security/advisories/GHSA-w2gf-jxc9-pf2q), patched 1.1.0) is also covered transitively. The GitHub Actions injection advisory ([GHSA-7q92-pph9-5686](https://github.com/duckdb/duckdb/security/advisories/GHSA-7q92-pph9-5686)) has no release impact.
-- **No external API credentials.** `.env.example` ships placeholders only (deferred to T-03 when env vars first matter). Ollama runs locally; the narrative module asserts the absence of external-API env vars at invocation time (AC-4.3).
+- **No external API credentials.** Environment variables (synthetic-row counts, deterministic seed, local Ollama host / model) are documented in the package README under "Environment variables"; every variable has a code-level default so `.env` is optional. Ollama runs locally; the narrative module asserts the absence of external-API env vars at invocation time (AC-4.3) — `ANTHROPIC_API_KEY` / `OPENAI_API_KEY` / `GEMINI_API_KEY` / `GOOGLE_API_KEY` / `AZURE_OPENAI_API_KEY` / `COHERE_API_KEY` set in the environment causes `make narrative` to fail-stop with a remediation message.
 - **`pip-audit` in CI**, fail on HIGH or CRITICAL severity. Wired in T-12 via a new `.github/workflows/python-audit.yml` workflow that runs alongside the existing `pnpm-audit.yml`.
 - **Dependabot Python ecosystem.** T-12 adds the `pip` ecosystem to `.github/dependabot.yml` so security upgrades surface as PRs.
 - **Generated artifacts gitignored.** `warehouse/*.duckdb`, `ml/artifacts/*`, `dashboard/build/`, `narrative/output.md` never enter the repo, removing accidental data-leak surface.
diff --git a/packages/data-analytics-demo/README.md b/packages/data-analytics-demo/README.md
index c8f31cb..08d3afa 100644
--- a/packages/data-analytics-demo/README.md
+++ b/packages/data-analytics-demo/README.md
@@ -54,6 +54,22 @@ semantic MetricFlow YAML — 3 semantic models, 4 KPI metrics; structural inv
 
 See [docs/architecture.md](docs/architecture.md) for the pipeline diagram and per-layer details.
 
+## Environment variables
+
+Every variable has a code-level default — `.env` is optional. Defaults shown match the in-code values.
+
+| Variable | Default | Purpose |
+| ---------------------- | ----------------------------- | ---------------------------------------------------------------------------------------------------- |
+| `DEMO_RANDOM_SEED` | `42` | Master seed for Faker + numpy + sklearn; controls byte-deterministic regeneration (AC-1.5 / AC-δ.2). |
+| `DEMO_N_CUSTOMERS` | `1000` | Row count for the `customers` table. |
+| `DEMO_N_SUBSCRIPTIONS` | `2000` | Row count for the `subscriptions` table. |
+| `DEMO_N_EVENTS` | `50000` | Row count for the `events` table. |
+| `DEMO_N_INVOICES` | `5000` | Row count for the `invoices` table. |
+| `OLLAMA_HOST` | `http://localhost:11434` | Local Ollama daemon endpoint used by the narrative layer. |
+| `OLLAMA_MODEL` | `llama3.1:8b-instruct-q4_K_M` | Ollama model identifier; must already be pulled (`ollama pull `). |
+
+**Prohibited variables (AC-4.3 fail-stop):** the narrative layer raises `RuntimeError` at invocation if any of `ANTHROPIC_API_KEY`, `OPENAI_API_KEY`, `GEMINI_API_KEY`, `GOOGLE_API_KEY`, `AZURE_OPENAI_API_KEY`, `COHERE_API_KEY` is set. All inference is local.
+
 ## Constraints (load-bearing — see [ADR-0070](../../docs/adr/0070-data-analytics-demo-polyglot-adoption.md))
 
 - **Zero credit card.** No Snowflake / BigQuery free trial; no Anthropic / OpenAI / Gemini API.
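
Review note: the README hunk above describes two behaviours, code-level defaults for every variable and the AC-4.3 fail-stop on external-API keys. The sketch below shows one way they could fit together. It is not the package's actual code: the function names `load_settings` and `assert_no_external_api_keys`, the dictionary layout, and the wording of the error message are assumptions made for illustration. The sketch also assumes a variable counts as "set" whenever it is present in the environment, even with an empty value. Only the variable names, the defaults, and the `RuntimeError` type come from the diff.

```python
import os
from typing import Mapping, Optional

# Names copied from the README's "Prohibited variables" list.
PROHIBITED_VARS = (
    "ANTHROPIC_API_KEY",
    "OPENAI_API_KEY",
    "GEMINI_API_KEY",
    "GOOGLE_API_KEY",
    "AZURE_OPENAI_API_KEY",
    "COHERE_API_KEY",
)

# Defaults copied from the README table; the integer/string split is an assumption.
INT_DEFAULTS = {
    "DEMO_RANDOM_SEED": 42,
    "DEMO_N_CUSTOMERS": 1000,
    "DEMO_N_SUBSCRIPTIONS": 2000,
    "DEMO_N_EVENTS": 50000,
    "DEMO_N_INVOICES": 5000,
}
STR_DEFAULTS = {
    "OLLAMA_HOST": "http://localhost:11434",
    "OLLAMA_MODEL": "llama3.1:8b-instruct-q4_K_M",
}


def load_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Hypothetical helper: read each variable, falling back to its default."""
    env = os.environ if env is None else env
    settings: dict = {}
    for name, default in INT_DEFAULTS.items():
        settings[name] = int(env.get(name, default))
    for name, default in STR_DEFAULTS.items():
        settings[name] = env.get(name, default)
    return settings


def assert_no_external_api_keys(env: Optional[Mapping[str, str]] = None) -> None:
    """Hypothetical guard: fail-stop if any prohibited variable is present."""
    env = os.environ if env is None else env
    found = [name for name in PROHIBITED_VARS if name in env]
    if found:
        # The message text is illustrative, not the project's remediation message.
        raise RuntimeError(
            "External API credentials detected: "
            + ", ".join(found)
            + ". Unset them before running the narrative layer; all inference is local."
        )
```

Both helpers accept an explicit mapping so the behaviour can be exercised in tests without mutating the real process environment; calling them with no argument falls back to `os.environ`.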