This repository contains the full replication materials for an empirical analysis of BART fare gate installations and their effects on station-level ridership. It is structured to support end-to-end reproducibility, including raw data, analysis scripts/notebooks, and the final paper.
The repository is intended for academic replication, evaluation, and review. After unpacking the data archive, all scripts in scripts/notebooks/ should run without modification, subject to the software requirements listed below.
BART_Fare_Gates/
├── data/
│   └── bart_data.zip                    # Replication dataset (Git LFS tracked)
├── scripts/
│   └── notebooks/
│       ├── Regression_Analysis.ipynb
│       ├── Regression_Analysis_monthly.ipynb
│       ├── bart_data_cleaning.do
│       ├── bart_descriptive_stats.do
│       └── ...
├── AustinCoffelt_TermBARTPaper.pdf      # Final paper
├── .gitignore
└── README.md
Notes:
- Generated figures, tables, and logs are intentionally excluded from version control.
- The data archive is tracked using Git LFS.
git clone https://github.com/austin7384/BART_Fare_Gates.git
cd BART_Fare_Gates

If using Git LFS for the first time:

git lfs install
git lfs pull

Unzip the replication data in place so that the directory structure matches what the scripts expect:

cd data
unzip bart_data.zip
cd ..

After unzipping, the data/ directory should contain the raw and intermediate data files referenced by the notebooks and Stata scripts.
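If `unzip` is not available, or the archive fails to open, the same step can be sketched in Python. A common cause of a failed unzip is that `git lfs pull` has not run, which leaves a small text pointer file in place of the real archive. The helper names below (`is_lfs_pointer`, `extract_archive`) are illustrative and are not part of the repository.

```python
import zipfile
from pathlib import Path

# Every Git LFS pointer file begins with this line (Git LFS pointer format).
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path: Path) -> bool:
    """Return True if the file is an LFS pointer rather than the real archive."""
    with open(path, "rb") as fh:
        return fh.read(len(LFS_POINTER_PREFIX)) == LFS_POINTER_PREFIX


def extract_archive(archive: Path) -> list[str]:
    """Extract the archive into its own directory and return the member names."""
    if is_lfs_pointer(archive):
        raise RuntimeError(
            f"{archive} is a Git LFS pointer; run `git lfs pull` first."
        )
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(archive.parent)
        return zf.namelist()
```

From the repository root, `extract_archive(Path("data") / "bart_data.zip")` extracts in place and returns the list of files it unpacked, which can be compared against what the scripts expect.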
All analysis code is located in:
scripts/notebooks/
The workflow proceeds in three logical stages:
- Data cleaning and construction: bart_data_cleaning.do
- Descriptive statistics: bart_descriptive_stats.do
- Regression and event-study analysis: Regression_Analysis.ipynb, Regression_Analysis_monthly.ipynb
The notebooks assume that the data has already been prepared by the Stata scripts. Output paths are relative and will be created automatically if they do not exist.
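The stage order above can be sketched as a small driver script. This is a minimal illustration, not something shipped with the repository. It assumes a Unix-style Stata executable on the PATH (its name varies by edition, for example `stata-se` or `stata-mp`), that `jupyter nbconvert` is installed, and that the scripts are meant to be run from `scripts/notebooks/`.

```python
import subprocess
from pathlib import Path

# Assumed working directory for all stages.
SCRIPT_DIR = Path("scripts") / "notebooks"

# Stage order taken from the workflow description.
STATA_SCRIPTS = ["bart_data_cleaning.do", "bart_descriptive_stats.do"]
NOTEBOOKS = ["Regression_Analysis.ipynb", "Regression_Analysis_monthly.ipynb"]


def build_commands(stata_exe: str = "stata") -> list[list[str]]:
    """Return one command per script or notebook, in execution order.

    stata_exe is an assumption; adjust it to match the local installation.
    """
    # Stata batch mode: `stata -b do file.do` writes file.log in the working dir.
    commands = [[stata_exe, "-b", "do", name] for name in STATA_SCRIPTS]
    # Execute each notebook and save the executed copy over the original.
    commands += [
        ["jupyter", "nbconvert", "--to", "notebook", "--execute", "--inplace", name]
        for name in NOTEBOOKS
    ]
    return commands


def run_all(stata_exe: str = "stata", cwd: Path = SCRIPT_DIR) -> None:
    """Run every stage in order, stopping at the first non-zero exit code."""
    for cmd in build_commands(stata_exe):
        subprocess.run(cmd, cwd=cwd, check=True)
```

Calling `run_all("stata-mp")` from the repository root runs the two Stata scripts and then the two notebooks. `check=True` only catches failures reported through an exit code, so it is still worth reading the Stata `.log` files after a run.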
The analysis was developed and run using the following environment:
- Stata (for .do files)
- Python 3.9+
  - pandas
  - numpy
  - matplotlib
  - statsmodels
  - jupyter
A standard scientific Python stack is sufficient. No proprietary Python packages are required.
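Before opening the notebooks, a quick check along these lines can confirm that the Python side of the environment is in place. The function names are illustrative. The check covers only the Python version and whether each listed package can be imported; it does not check package versions or the Stata installation.

```python
import importlib.util
import sys

# Packages and minimum Python version listed in the requirements above.
REQUIRED_PACKAGES = ["pandas", "numpy", "matplotlib", "statsmodels", "jupyter"]
MIN_PYTHON = (3, 9)


def missing_packages(names=REQUIRED_PACKAGES):
    """Return the names that cannot be found on the current import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def python_ok(version=sys.version_info[:2]):
    """Return True if the (major, minor) version meets the minimum."""
    return tuple(version) >= MIN_PYTHON
```

For example, `missing_packages()` returning an empty list and `python_ok()` returning `True` suggests the notebooks can at least import what they need.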
- All file paths are relative; the project should run as-is once the data archive is unzipped.
- Randomness is not used in estimation; results should be exactly reproducible.
- Figures and tables in the paper can be regenerated from the scripts, though generated outputs are not tracked in Git.
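One way to check the reproducibility claim is to hash the generated outputs after two separate runs and compare the results. The sketch below is an assumption about how such a check might look, not part of the repository, and the output directory is whatever the scripts create locally. Byte-level hashes are strict: files that embed timestamps or other metadata (some image formats, PDFs, logs) can differ even when the underlying numbers are identical, so tabular outputs such as CSV files are the most meaningful targets.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def digest_tree(root: Path) -> dict[str, str]:
    """Map each file under root (by relative POSIX path) to its digest."""
    return {
        p.relative_to(root).as_posix(): file_digest(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_files(before: dict[str, str], after: dict[str, str]) -> list[str]:
    """Return paths that were added, removed, or modified between two runs."""
    return sorted(
        key for key in before.keys() | after.keys()
        if before.get(key) != after.get(key)
    )
```

A typical use is to call `digest_tree` on the output directory after the first run, rerun the pipeline, call it again, and pass both results to `changed_files`; an empty list means every output file was reproduced byte for byte.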
The final paper is available in the repository root:
AustinCoffelt_TermBARTPaper.pdf
For questions regarding replication or code, please contact:
Austin Coffelt
This repository is provided for academic and research purposes. Please cite appropriately if using or extending this work.