Interactive Dash app for comparing causal measurement approaches on a real randomised marketing experiment (Hillstrom, 2008).
Live demo: [Hugging Face Space](https://huggingface.co/spaces/jordancheney89/causality)
This project provides a dashboard to:
- estimate average treatment effects with uncertainty
- inspect heterogeneity and targeting value
- show where each method agrees and disagrees, and what assumptions drive the result
Teams might ask two different questions:
- "Did the campaign work on average?" (causal effect / ATE)
- "Who should we target next?" (HTE / uplift policy)
This dashboard puts both views side by side so that methodological choices and their business implications are easy to compare.
| Tab | Method | Role in this project |
|---|---|---|
| 2 | PSM sensitivity (propensity matching + caliper) | Observational-style diagnostic; matched ATT on pruned cohort vs control |
| 3 | Bayesian A/B (PyMC hurdle model) | Probabilistic effect estimation with posterior uncertainty |
| 4 | Uplift / HTE (T-Learner, S-Learner) | Ranking customers by estimated incremental value |
| 5 | Multi-Arm OLS with interactions | Precision-adjusted average effects and subgroup patterns |
| 6 | Method Comparison | Side-by-side estimate reconciliation and takeaway |
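For a flavour of what Tab 3 estimates, here is a minimal PyMC hurdle-model sketch: a Bernoulli conversion part plus a log-normal spend-given-purchase part, per arm. Priors, variable names, and structure are illustrative assumptions, not the app's exact model (that lives in `causal_utils.py`).

```python
# Hedged sketch of a two-part ("hurdle") A/B model in the spirit of Tab 3.
import numpy as np
import pymc as pm

def hurdle_ab(spend_ctrl: np.ndarray, spend_treat: np.ndarray):
    with pm.Model():
        # Conversion probability per arm (index 0 = control, 1 = treated)
        p = pm.Beta("p", alpha=1.0, beta=1.0, shape=2)
        pm.Bernoulli("conv_c", p=p[0], observed=(spend_ctrl > 0).astype(int))
        pm.Bernoulli("conv_t", p=p[1], observed=(spend_treat > 0).astype(int))
        # Spend given purchase, modelled log-normally per arm
        mu = pm.Normal("mu", 0.0, 2.0, shape=2)
        sigma = pm.HalfNormal("sigma", 2.0, shape=2)
        pm.LogNormal("pos_c", mu=mu[0], sigma=sigma[0], observed=spend_ctrl[spend_ctrl > 0])
        pm.LogNormal("pos_t", mu=mu[1], sigma=sigma[1], observed=spend_treat[spend_treat > 0])
        # Expected revenue per customer = P(convert) * E[spend | purchase]
        rev = pm.Deterministic("rev", p * pm.math.exp(mu + 0.5 * sigma**2))
        pm.Deterministic("lift", rev[1] - rev[0])
        return pm.sample(1000, tune=1000, chains=2, random_seed=0)
```

The posterior of `lift` then yields the posterior probability of a positive effect and an HDI for its magnitude, which is the kind of summary the Bayesian tab reports.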
Source: MineThatData Email Analytics (Hillstrom)
Randomised experiment across ~64,000 customers:
- Three arms: Men's email, Women's email, and Control (split into roughly equal thirds)
- Primary outcome: 2-week post-campaign spend (USD)
- Key covariates: recency, history, mens/womens indicators, zip code, newbie, channel
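For intuition on the Tab 2 diagnostic, here is a minimal propensity-matching sketch with a caliper. It assumes a DataFrame holding a numeric subset of the covariates above plus binary `treated` and numeric `spend` columns; the column subset, model, and caliper value are illustrative, not the app's exact implementation.

```python
# Hedged sketch: 1-nearest-neighbour propensity matching with a caliper,
# returning a matched ATT on the pruned cohort. Illustrative only.
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import NearestNeighbors

COVS = ["recency", "history", "mens", "womens", "newbie"]  # assumed numeric columns

def matched_att(df: pd.DataFrame, caliper: float = 0.05) -> float:
    # Estimate propensity scores, then match each treated unit to its
    # nearest control on the score, dropping pairs outside the caliper.
    ps = LogisticRegression(max_iter=1000).fit(df[COVS], df["treated"])
    df = df.assign(pscore=ps.predict_proba(df[COVS])[:, 1])
    t, c = df[df["treated"] == 1], df[df["treated"] == 0]
    dist, idx = NearestNeighbors(n_neighbors=1).fit(c[["pscore"]]).kneighbors(t[["pscore"]])
    keep = dist.ravel() <= caliper  # prune poor matches; this is the sensitivity knob
    return float((t["spend"].to_numpy()[keep]
                  - c["spend"].to_numpy()[idx.ravel()[keep]]).mean())
```

Varying `caliper` and re-reading the matched ATT is the kind of sensitivity check this tab is named for.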
Dependencies are managed with uv (pyproject.toml + uv.lock).
- uv installed
- Python 3.13+
From the project root:
```bash
uv sync
```

This creates `.venv` (if needed) and installs the locked dependency set.
```bash
uv run python app.py
```

Then open http://localhost:8050.
- First run precomputes models and caches results in `.cache/results.pkl`.
- Subsequent runs load from cache and start quickly.
- Depending on machine speed, the initial build can take several minutes.
To force a rebuild:
- Delete `.cache/results.pkl`, or set `USE_CACHE = False` in `causal_utils.py`.
- Restart the app once to rebuild the cache.
- Set `USE_CACHE` back to `True` after a deliberate rebuild (optional; deleting the pickle has the same effect if `USE_CACHE` stays `True`).
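This behaviour amounts to a load-or-compute pattern, roughly as follows. It is a hedged sketch: `compute_all_results` is a stand-in for the real estimation pipeline in `causal_utils.py`.

```python
# Hedged sketch of the load-or-compute caching described above.
import pickle
from pathlib import Path

CACHE_PATH = Path(".cache/results.pkl")
USE_CACHE = True  # set False to force a rebuild on the next run

def compute_all_results() -> dict:
    # Stand-in for the expensive PSM / Bayesian / uplift pipeline.
    ...

def load_results() -> dict:
    if USE_CACHE and CACHE_PATH.exists():
        return pickle.loads(CACHE_PATH.read_bytes())  # fast start from cache
    results = compute_all_results()                   # slow path: several minutes
    CACHE_PATH.parent.mkdir(exist_ok=True)
    CACHE_PATH.write_bytes(pickle.dumps(results))     # persist for the next start
    return results
```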
Live Space: huggingface.co/spaces/jordancheney89/causality
This repo includes a Dockerfile configured for the Docker Spaces SDK.
- If you change estimation logic in `causal_utils.py`, delete `.cache/results.pkl` or set `USE_CACHE = False`, then rerun the app once.
- `uv.lock` pins transitive versions; run `uv lock` after changing dependencies in `pyproject.toml`.
The live app is deployed on Hugging Face Spaces using Docker.
Precomputed model outputs are stored in .cache/results.pkl so the app can start quickly without rerunning the Bayesian, PSM and uplift models on every container start.
Hugging Face Spaces reads YAML front matter at the very top of README.md. This repository keeps that metadata only on the `hf-space` branch, while `main` (GitHub) uses a normal README without front matter.
Push targets
- GitHub: `git push origin main`
- Space (updates the Space repo's `main`): `git push space hf-space:main`
After you change code on `main`, refresh the Space branch:

```bash
git checkout hf-space
git merge main
```

If the merge removes or conflicts on the README header, put the Space YAML back as the first lines of README.md, then save and commit on `hf-space`. Use this block:
```yaml
---
title: Causal Inference Dashboard
emoji: π
colorFrom: indigo
colorTo: purple
sdk: docker
app_port: 7860
license: mit
---
```

Then push: `git push space hf-space:main` (and optionally `git push origin hf-space` to back the branch up on GitHub).
- The underlying dataset is randomised, so causal identification of average effects comes from random assignment.
- Covariate-adjusted and matched analyses are included as precision, sensitivity, and interpretability tools.
- Uplift metrics are useful for ranking policy decisions but should ideally be reported with uncertainty intervals when used for high-stakes targeting.
- Subgroup interaction findings are exploratory unless multiplicity is explicitly controlled (see the sketch after this list).
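On that last point, here is a hedged statsmodels sketch of a Tab-5-style multi-arm OLS with an interaction term; the exact formula, covariates, and robust-error choice in the app may differ, and the column names are assumed from the dataset description above.

```python
# Hedged sketch: multi-arm OLS with an interaction term and HC3 robust errors.
import statsmodels.formula.api as smf

def fit_multiarm_ols(df):
    # C(segment) encodes the three arms; the segment-by-newbie interaction
    # probes one subgroup pattern. Treat interaction p-values as exploratory
    # unless adjusted for multiplicity (e.g. Holm or FDR).
    return smf.ols("spend ~ C(segment) * newbie + recency + history",
                   data=df).fit(cov_type="HC3")
```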
- Agreement and disagreement across methods for Mens vs Control and Womens vs Control.
- Posterior probability and HDI width in Bayesian A/B (effect magnitude + uncertainty).
- Whether uplift curves and decile lift indicate actionable ranking value beyond random targeting (see the sketch after this list).
- Consistency between OLS interaction patterns and uplift heterogeneity signals.
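To make the uplift items concrete, here is a minimal sketch of T-Learner scoring and a decile-lift profile, assuming a binary `treated` flag (one arm vs control) and illustrative model and column choices; the app's Tab 4 implementation lives in `causal_utils.py`.

```python
# Hedged sketch: T-Learner uplift scores and an observed decile-lift profile.
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor

COVS = ["recency", "history", "mens", "womens", "newbie"]  # assumed columns

def t_learner_scores(df: pd.DataFrame) -> np.ndarray:
    # Fit a separate outcome model per arm; the score difference is the
    # estimated per-customer incremental spend.
    m1 = GradientBoostingRegressor().fit(df.loc[df["treated"] == 1, COVS],
                                         df.loc[df["treated"] == 1, "spend"])
    m0 = GradientBoostingRegressor().fit(df.loc[df["treated"] == 0, COVS],
                                         df.loc[df["treated"] == 0, "spend"])
    return m1.predict(df[COVS]) - m0.predict(df[COVS])

def decile_lift(df: pd.DataFrame, scores: np.ndarray) -> pd.Series:
    # Observed treated-minus-control spend within each predicted-uplift decile;
    # lift concentrated in the top deciles suggests ranking value beyond
    # random targeting.
    deciles = pd.qcut(pd.Series(scores, index=df.index), 10,
                      labels=False, duplicates="drop")
    return df.groupby(deciles).apply(
        lambda g: g.loc[g["treated"] == 1, "spend"].mean()
                  - g.loc[g["treated"] == 0, "spend"].mean())
```

For high-stakes targeting, pair these decile estimates with bootstrap intervals, per the note in the previous section.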
```text
.
├── Dockerfile            # Hugging Face Spaces (Docker SDK); gunicorn on port 7860
├── .dockerignore         # Smaller build context (excludes .venv, caches of dev tools)
├── app.py                # Thin entrypoint: Dash app, theme registration, layout, callback wiring
├── causal_utils.py       # Data prep, caching, and all causal estimation logic
├── dashboard/
│   ├── theme.py          # Design tokens, Plotly template, shared style dicts
│   └── data.py           # Loads cache → exposes RESULTS, DF, PSM, BAYESIAN, UPLIFT, OLS
├── layouts/
│   ├── shell.py          # Navbar + tab container; imports per-tab layouts
│   ├── components.py     # Reusable UI helpers (KPI cards, section headers, methodology collapse)
│   ├── overview.py       # Tab 1 layout
│   ├── psm.py            # Tab 2 layout
│   ├── bayesian.py       # Tab 3 layout
│   ├── uplift.py         # Tab 4 layout
│   ├── ols.py            # Tab 5 layout
│   └── comparison.py     # Tab 6 layout
├── callbacks/
│   ├── __init__.py       # register_callbacks(app)
│   ├── overview.py       # Tab 1 callbacks
│   ├── psm.py            # Tab 2 callbacks
│   ├── bayesian.py       # Tab 3 callbacks
│   ├── uplift.py         # Tab 4 callbacks
│   ├── ols.py            # Tab 5 callbacks
│   └── comparison.py     # Tab 6 callbacks
├── figures/
│   └── overview.py       # Static Plotly helpers for Overview tab
├── content/
│   └── methodology.py    # Long-form copy separated from layout code
├── assets/
│   └── style.css         # Global styles (Dash serves /assets automatically)
├── .cache/               # Precomputed outputs (e.g. results.pkl)
├── pyproject.toml
├── uv.lock
├── .python-version
└── README.md
```
- Add data ingestion wizard
MIT. See LICENSE.