[Code] • [Paper (coming soon)]
Causal AI Scientist (CAIS) is an LLM-powered tool for generating data-driven answers to natural language causal queries. Given a natural language query (e.g., "Does participating in a job training program lead to higher income?"), an accompanying dataset, and its description, CAIS frames a suitable causal estimation problem, selects an appropriate inference method, executes it, runs diagnostic checks, and interprets the results in plain language.
Note: This repository is a work in progress and will be updated with additional instructions and files.
## Table of Contents
- Introduction
- Pipeline
- Getting Started
- Dataset Information
- Running CAIS
- Reproducing Paper Results
- Citation
- License
## Introduction
Causal effect estimation is central to evidence-based decision-making across domains such as the social sciences, healthcare, and economics, but it requires substantial methodological expertise to apply correctly.
CAIS automates this process end-to-end using Large Language Models (LLMs) to:
- Parse a natural language causal query and analyze dataset characteristics.
- Select an appropriate causal inference method via a decision tree and structured prompting.
- Execute the method using predefined code templates and validate the results.
- Interpret the numerical output in the context of the original query.
Supported Methods:
- Econometric: Difference-in-Differences (DiD), Instrumental Variables (IV), Ordinary Least Squares (OLS), Regression Discontinuity Design (RDD).
- Causal Graph-based: Backdoor adjustment, Frontdoor adjustment.
## Pipeline
CAIS consists of four successive stages, powered by a decision-tree-driven reasoning pipeline:
**Stage 1: Dataset Analysis and Variable Identification**
- Profiles the dataset (column types, missing values, statistical distributions) and uses an LLM to identify treatment, outcome, and covariate variables.
- Scans for method-specific variables such as instruments and running variables based on the dataset description and causal query.
**Stage 2: Method Selection**
- Traverses a rule-based decision tree that evaluates dataset properties (e.g., randomization, presence of temporal structure, availability of instruments) to select a valid causal inference method.
- Breaking selection into explicit, verifiable steps ensures interpretability and avoids the opacity of direct LLM-based method selection.
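The decision-tree traversal can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code; the property names, their ordering, and the randomization-implies-OLS rule are all assumptions:

```python
# Minimal sketch of rule-based method selection over dataset properties.
# The properties and their ordering are illustrative assumptions, not the
# exact rules used by CAIS.
from dataclasses import dataclass

@dataclass
class DatasetProfile:
    randomized: bool               # was treatment randomly assigned?
    has_temporal_structure: bool   # pre/post treatment periods observed?
    has_instrument: bool           # candidate instrumental variable found?
    has_running_variable: bool     # threshold-based assignment variable?

def select_method(p: DatasetProfile) -> str:
    """Walk the tree from the strongest identification strategy down."""
    if p.randomized:
        return "OLS"  # under randomization, a simple regression suffices
    if p.has_running_variable:
        return "RDD"
    if p.has_temporal_structure:
        return "DiD"
    if p.has_instrument:
        return "IV"
    return "Backdoor adjustment"  # fall back to graph-based adjustment

profile = DatasetProfile(randomized=False, has_temporal_structure=False,
                         has_instrument=True, has_running_variable=False)
print(select_method(profile))  # -> IV
```

Each branch is an explicit, checkable condition, which is what makes the selection auditable compared with asking an LLM to name a method directly.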
If the Instrumental Variable (IV) method is selected and the `--iv_llm` pipeline is enabled (based on IV Co-Scientist), an additional instrument-discovery stage runs:
- Hypothesis Generation: The LLM hypothesizes potential instruments based on dataset context and variable names.
- Confounder Mining: Identifies potential confounders that might violate the independence or exclusion restrictions.
- Critic Validation: Uses specialized LLM "critics" (Exclusion, Independence) to reason about the validity of each candidate instrument.
- Final Selection: Selects the most robust instrument for the estimation stage.
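In outline, the critic loop might look like the following sketch. The `llm` callable, the prompts, and the voting scheme are hypothetical stand-ins, not the actual IV Co-Scientist implementation:

```python
# Hypothetical sketch of the multi-critic instrument-vetting loop.
# `llm` stands in for any chat-completion call; prompts are illustrative.
from typing import Callable

def vet_instruments(candidates: list[str], context: str,
                    llm: Callable[[str], str]) -> list[tuple[str, int]]:
    """Score each candidate instrument by exclusion/independence critic votes."""
    critics = {
        "exclusion": "Does {z} affect the outcome ONLY through the treatment? "
                     "Answer VALID or INVALID.\nContext: {ctx}",
        "independence": "Is {z} plausibly independent of unobserved confounders? "
                        "Answer VALID or INVALID.\nContext: {ctx}",
    }
    scored = []
    for z in candidates:
        votes = sum(
            llm(prompt.format(z=z, ctx=context)).strip().upper().startswith("VALID")
            for prompt in critics.values()
        )
        scored.append((z, votes))
    # Rank candidates by how many critics endorsed them.
    return sorted(scored, key=lambda t: -t[1])

# A toy deterministic "LLM" for illustration only.
fake_llm = lambda prompt: "VALID" if "distance" in prompt else "INVALID"
print(vet_instruments(["distance_to_college", "birth_month"], "wage study", fake_llm))
```

The top-ranked candidate then proceeds to the estimation stage.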
**Stage 3: Validation and Estimation**
- Runs standard statistical assumption checks for the selected method (e.g., the first-stage F-statistic for IV, covariate balance for OLS).
- If any check fails, initiates a feedback loop back to Stage 2, using information from the failure to skip the invalid method and identify the next plausible candidate.
- Executes the chosen method using predefined Python code templates with placeholders substituted from Stage 1, favoring reliability over free-form LLM-generated code.
**Stage 4: Interpretation**
- Prompts an LLM to interpret the estimated causal effect, standard error, and confidence interval in the context of the original query, alongside validation caveats and a clear statement of assumptions and limitations.
## Getting Started
Prerequisites:
- Python 3.10
- Conda (recommended)
**Step 1: Clone the repository and copy the example configuration**

```bash
git clone https://github.com/causalNLP/causal-agent.git
cd causal-agent
cp .env.example .env
```

**Step 2: Load necessary compute modules**

```bash
module load rust
module load gcc
module load openblas
```

**Step 3: Create and activate a Python 3.10 environment**

```bash
conda create -n cais python=3.10
conda activate cais
pip install -r requirements.txt
```

**Step 4: Install the CAIS library**

```bash
pip install -e .
```
⚠️ Keep your `.env` file secure and never commit it to version control.
## Dataset Information
All datasets used to evaluate CAIS and the baseline models are available in the `data/` directory:

| Path | Description |
|---|---|
| `data/all_data/` | CSV files from the QRData and real-world study collections |
| `data/synthetic_data/` | CSV files for synthetic datasets |
| `data/qr_info.csv` | Metadata for QRData: filename, description, query, reference effect, intended method, remarks |
| `data/real_info.csv` | Metadata for real-world datasets |
| `data/synthetic_info.csv` | Metadata for synthetic datasets |
## Running CAIS
```bash
python run_cais_new.py \
    --metadata_path <path_to_metadata_csv> \
    --data_dir <path_to_data_folder> \
    --output_dir <output_folder> \
    --output_name <output_filename> \
    --llm_name <llm_name> \
    --llm_provider <llm_provider> \
    [--iv_llm]
```

Arguments:
| Argument | Type | Description |
|---|---|---|
| `--metadata_path` | `str` | Path to the CSV file containing queries, dataset descriptions, and filenames |
| `--data_dir` | `str` | Path to the folder containing the data in CSV format |
| `--output_dir` | `str` | Path to the folder where output JSON results will be saved |
| `--output_name` | `str` | Name of the output JSON file |
| `--llm_name` | `str` | Name of the LLM to use (e.g., `gpt-4o`, `claude-3-5-sonnet`) |
| `--llm_provider` | `str` | LLM service provider (e.g., `openai`, `anthropic`, `together`) |
| `--iv_llm` | `bool` | (Optional) If present, enables the advanced experimental IV-LLM pipeline for instrument discovery. |
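The metadata CSV ties each query to a file inside `--data_dir`. A minimal sketch of that relationship, using a throwaway temp directory; the column names (`filename`, `description`, `query`) are illustrative assumptions, so check the actual headers of `data/qr_info.csv`:

```python
# Sketch of the metadata-driven loading pattern: each metadata row names a
# dataset file and its causal query. Column names are illustrative
# assumptions, not guaranteed to match data/qr_info.csv exactly.
import csv
import pathlib
import tempfile

root = pathlib.Path(tempfile.mkdtemp())
(root / "all_data").mkdir()

# Stand-in for one dataset CSV inside the data folder.
(root / "all_data" / "jobs.csv").write_text("treat,income\n1,45000\n0,38000\n")

# Stand-in for the metadata CSV.
(root / "qr_info.csv").write_text(
    "filename,description,query\n"
    "jobs.csv,Job training study,Does training raise income?\n"
)

with open(root / "qr_info.csv", newline="") as f:
    for row in csv.DictReader(f):
        data_path = root / "all_data" / row["filename"]
        n_rows = data_path.read_text().count("\n") - 1  # minus header line
        print(row["query"], "->", data_path.name, f"({n_rows} rows)")
```

Each run iterates over the metadata rows, loads the referenced dataset, and writes one JSON result per query to `--output_dir`.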
Example:

```bash
python run_cais_new.py \
    --metadata_path "data/qr_info.csv" \
    --data_dir "data/all_data" \
    --output_dir "output" \
    --output_name "results_qr_4o" \
    --llm_name "gpt-4o-mini" \
    --llm_provider "openai" \
    --iv_llm
```

## Reproducing Paper Results

Will be updated soon.
## Citation
If you use CAIS or build on this work, we would appreciate it if you could cite:

```bibtex
@inproceedings{verma2025causal,
  title={Causal {AI} Scientist: Facilitating Causal Data Science with Large Language Models},
  author={Vishal Verma and Sawal Acharya and Devansh Bhardwaj and Samuel Simko and Yongjin Yang and Anahita Haghighat and Dominik Janzing and Mrinmaya Sachan and Bernhard Sch{\"o}lkopf and Zhijing Jin},
  booktitle={NeurIPS 2025 Workshop on CauScien: Uncovering Causality in Science},
  year={2025},
  url={https://openreview.net/forum?id=EDWTHMVOCj}
}
```

The IV-LLM pipeline builds on the methodology introduced in IV Co-Scientist. If you use that component, please also cite:
```bibtex
@misc{sheth2026ivcoscientistmultiagentllm,
  title={IV Co-Scientist: Multi-Agent LLM Framework for Causal Instrumental Variable Discovery},
  author={Ivaxi Sheth and Zhijing Jin and Bryan Wilder and Dominik Janzing and Mario Fritz},
  year={2026},
  eprint={2602.07943},
  archivePrefix={arXiv},
  primaryClass={cs.AI},
  url={https://arxiv.org/abs/2602.07943}
}
```

## License

Distributed under the MIT License. See `LICENSE` for more information.
