California Foster Care Panel Analysis

Does child welfare reform work? Did COVID permanently change California's foster care system?

This project uses 15 years of California county-level administrative data to evaluate two policy questions:

  1. Did AB 403 (Continuum of Care Reform, 2015) reduce foster care entries — and did it work where it was needed most?
  2. Did COVID-19 cause a permanent structural break in foster care entry rates, or a temporary disruption counties recovered from?

Summary of findings: AB 403 reduced entries most in the counties with the highest pre-reform rates. COVID caused a 25% drop in entries that never recovered — by 2024, California's entry rate was 30% below 2019 and still falling.

📄 Read the full analysis: Medium article


Results at a Glance

| Finding | Result |
|---|---|
| Statewide entry rate drop (2019 → 2020) | −24.6% |
| Statewide entry rate vs 2019 in 2024 | −30.2% |
| AB 403 interaction (high-entry counties) | −0.152 pp per SD (p < 0.001) |
| COVID years effect (M3) | −0.098 (p < 0.001) |
| Post-COVID effect (M3) | −0.153 (p < 0.001) |
| Child poverty → entry rate (M5) | +0.143 (p = 0.007) |
| Model fit (R², preferred M3) | 0.716 |
| Panel dimensions | 58 counties × 15 years = 870 obs |

Repository Structure

ca-foster-care-panel-analysis/
│
├── README.md
├── requirements.txt
│
├── data/                          # data folder (see instructions below)
│   ├── README_data.md             # exact download instructions for CCWIP
│   └── fc_acs_panel.csv           # ACS covariates (generated by fc_01)
│
├── figures/                       # all output figures (generated by scripts)
│
├── fc_01_load_data.py             # Step 1: load all sources, build panel
├── fc_02_eda.py                   # Step 2: exploratory data analysis
├── fc_02b_premodel_checks.py      # Step 3: pre-modelling validation
└── fc_03_model.py                 # Step 4: TWFE regression + event study

Data Sources

1. CCWIP — California Child Welfare Indicators Project

URL: https://ccwip.berkeley.edu

The primary data source. Requires a free account. The following files are needed:

| File | Report name on CCWIP | Settings |
|---|---|---|
| Entry Rates | Entry Rates | All counties, Incidence per 1,000, Child Welfare only, 2010–present |
| In-Care Rates | In Care Rates | All counties, Prevalence per 1,000, Child Welfare only, 2010–present |
| 4-P1 Permanency | Federal Measures → 4-P1 | Multi-county view, all years |
| Point-in-Time | Point In Time / In Care | All counties, all years |
| Exits | Exits From Foster Care | All counties, all years |

Download each as Excel and save to the data/ folder. The loading script auto-detects filenames — they do not need to match exactly, only contain the keyword (EntryRates, InCareRates, P1, PIT, Exits).
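The keyword matching described above could be sketched as follows. The function name `find_ccwip_files` and the case-insensitive substring rule are assumptions made for illustration; the actual logic in fc_01_load_data.py may differ.

```python
from pathlib import Path

# Keywords come from the README; the matching rule below is an assumption.
KEYWORDS = ["EntryRates", "InCareRates", "P1", "PIT", "Exits"]

def find_ccwip_files(folder):
    """Map each keyword to the first .xlsx file whose name contains it."""
    found = {}
    for path in sorted(Path(folder).glob("*.xlsx")):
        name = path.name.lower()
        for key in KEYWORDS:
            # Case-insensitive substring match; first file wins per keyword.
            if key.lower() in name and key not in found:
                found[key] = path
                break
    return found
```

A keyword that is missing from the returned dictionary indicates a download that still needs to be saved to data/.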

Note on de-identification: CCWIP masks cells with values 1–10 per CDSS data de-identification guidelines. These appear as "M" in downloads and are treated as NaN in the analysis. The entry rate file has zero masked cells. Three counties (Alpine, Sierra, Mono) are excluded from the permanency event study due to sparse data.
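One way to turn masked cells into NaN is shown below. This is a minimal sketch: the helper name `unmask` and the toy column names are hypothetical, and the real loading script may handle masking differently.

```python
import pandas as pd

def unmask(df, value_cols):
    """Return a copy in which masked cells ("M") are NaN and values are numeric."""
    out = df.copy()
    for col in value_cols:
        # Blank out the mask marker, then convert what remains to numbers.
        out[col] = pd.to_numeric(out[col].mask(out[col] == "M"))
    return out

# Toy example with made-up values.
raw = pd.DataFrame({"county": ["Alpine", "Marin"], "exits": ["M", "42"]})
clean = unmask(raw, ["exits"])
```

Any non-numeric value other than "M" still raises an error in `pd.to_numeric`, so unexpected markers are not silently dropped.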

Note on redistribution: CCWIP data is provided for research use and should not be redistributed. This repository does not include the raw CCWIP files.

2. ACS — American Community Survey

Pulled automatically via the Census API when fc_01_load_data.py is run with internet access. No manual download needed. Covers:

  • Child poverty rate (B17001)
  • Child population under 18 (B09001)
  • Racial composition — Black, Native American (B02001)
  • Hispanic share (B03003)
  • Housing cost burden — rent >30% of income (B25070)
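For orientation, a county-level ACS 5-year request for California can be built like this. The helper `acs5_county_url` is hypothetical and the variable ID in the usage example is only illustrative; the exact variables and request logic used by fc_01_load_data.py are not shown in this README.

```python
from urllib.parse import urlencode

CA_FIPS = "06"  # California state FIPS code

def acs5_county_url(year, variables, api_key=None):
    """Build a Census API URL for ACS 5-year data, all California counties."""
    params = {
        "get": ",".join(["NAME", *variables]),
        "for": "county:*",
        "in": f"state:{CA_FIPS}",
    }
    if api_key:
        params["key"] = api_key
    # Keep commas, colons and the wildcard readable in the query string.
    return f"https://api.census.gov/data/{year}/acs/acs5?" + urlencode(params, safe=",:*")

url = acs5_county_url(2022, ["B17001_001E"])
```

The response is a JSON array whose first row holds the column names, so it can be loaded into a DataFrame with one row per county.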

The generated file fc_acs_panel.csv is included in this repository for convenience.


Quickstart

Requirements

pip install -r requirements.txt

requirements.txt:

pandas>=2.0
numpy>=1.24
scipy>=1.10
matplotlib>=3.7
openpyxl>=3.1

No R. No statsmodels. No proprietary tools. Runs in any standard scientific Python environment.

Run order

# 1. Load all data sources and build master panel
python fc_01_load_data.py

# 2. Exploratory data analysis — generates figures/fc_01 through fc_06
python fc_02_eda.py

# 3. Pre-modelling checks — parallel trends, skew, AB 403 signal
python fc_02b_premodel_checks.py

# 4. TWFE regression and COVID event study
python fc_03_model.py

Configure paths

At the top of each script, set:

UPLOADS = "data/"   # folder containing your CCWIP Excel files
DATA    = "data/"   # folder for output CSVs

Analytical Methods

Two-Way Fixed Effects (TWFE) Panel Regression

Outcome: sqrt(entry_rate) — square root transformation reduces right skew from 1.82 to 0.32.
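The effect of the transformation can be checked with `scipy.stats.skew`. The sketch below uses synthetic gamma-distributed rates, not the CCWIP data, so the numbers will not match the 1.82 and 0.32 reported above.

```python
import numpy as np
from scipy.stats import skew

# Synthetic right-skewed "entry rates" (assumption: gamma-distributed, 870 obs).
rng = np.random.default_rng(0)
rates = rng.gamma(shape=2.0, scale=1.5, size=870)

raw_skew = skew(rates)            # skewness of the raw rates
sqrt_skew = skew(np.sqrt(rates))  # skewness after the square-root transform

print(f"raw skew: {raw_skew:.2f}, sqrt skew: {sqrt_skew:.2f}")
```

Unlike a log transform, the square root is defined at zero, which matters if any county-year has an entry rate of 0.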

Fixed effects:

  • County FEs: Remove stable differences between counties (geography, history, demographics). Before the regression runs, each county's own average entry rate across all years is subtracted from every one of its observations. This removes the stable baseline — the fact that Trinity is always high and Marin is always low has nothing to do with AB 403 or COVID, so it gets subtracted out. After this step the regression only sees how each county changed relative to its own history.
  • Year FEs: Remove common shocks affecting all counties (statewide policy, economic cycles). Each year's average entry rate across all counties is subtracted from every observation in that year. This removes the common year effect — in 2020, entries dropped everywhere because schools closed and mandatory reporters stopped seeing children. Subtracting the 2020 average strips out what every county shared, leaving only the county-specific deviation. What the regression analyses is the leftover after both subtractions — neither a stable county characteristic nor a common year shock.
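The two subtractions described above can be sketched in pandas. This assumes a balanced panel (every county observed in every year, as in the 58 × 15 panel here); the column names and the helper `two_way_demean` are illustrative, not taken from fc_03_model.py.

```python
import pandas as pd

def two_way_demean(df, col, unit="county", time="year"):
    """Within transformation for a balanced panel.

    Subtracts each county's mean and each year's mean, then adds back the
    grand mean so it is not removed twice.
    """
    y = df[col]
    unit_mean = y.groupby(df[unit]).transform("mean")
    time_mean = y.groupby(df[time]).transform("mean")
    return y - unit_mean - time_mean + y.mean()

# Toy balanced panel: 2 counties x 2 years, made-up values.
toy = pd.DataFrame({
    "county": ["A", "A", "B", "B"],
    "year": [1, 2, 1, 2],
    "y": [1.0, 3.0, 5.0, 11.0],
})
toy["y_within"] = two_way_demean(toy, "y")
```

After the transformation the values sum to zero within every county and within every year, which is what "only the county-specific deviation" means in practice. For unbalanced panels this shortcut is no longer exact and explicit dummies or iterative demeaning are needed.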

Key predictors:

  • post_AB403 × pre_reform_rate_z: interaction testing differential AB 403 effect by pre-reform entry level
  • covid_year: indicator for 2020–2021
  • post_covid: indicator for 2022–2024
  • ACS covariates (M5 only): child poverty, racial composition, housing burden
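The predictors listed above could be constructed along these lines. This is a sketch under stated assumptions: the pre-reform window (2010–2016), the reform year (2017, taken from the "post-2017 dummy" mentioned under Limitations), the column names and the helper `add_predictors` are all illustrative choices rather than the script's actual definitions.

```python
import pandas as pd

def add_predictors(panel, pre_years=range(2010, 2017), reform_year=2017):
    """Add AB 403 and COVID regressors to a county-year panel."""
    out = panel.copy()
    # County mean entry rate over the assumed pre-reform window, z-scored across counties.
    pre = out[out["year"].isin(list(pre_years))].groupby("county")["entry_rate"].mean()
    z = (pre - pre.mean()) / pre.std()
    out["pre_reform_rate_z"] = out["county"].map(z)
    out["post_AB403"] = (out["year"] >= reform_year).astype(int)
    # Interaction term: differential post-reform change per SD of pre-reform rate.
    out["ab403_x_pre"] = out["post_AB403"] * out["pre_reform_rate_z"]
    out["covid_year"] = out["year"].isin([2020, 2021]).astype(int)
    out["post_covid"] = out["year"].between(2022, 2024).astype(int)
    return out
```

Because `pre_reform_rate_z` is constant within a county, its main effect is absorbed by the county fixed effects; only the interaction with `post_AB403` is identified.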

Model specifications:

| Model | Specification | R² | Adj-R² |
|---|---|---|---|
| M1 | County + Year FEs | 0.704 | 0.678 |
| M2 | + AB 403 interaction | 0.716 | 0.690 |
| M3 | + COVID indicators (preferred) | 0.716 | 0.689 |
| M4 | M3 drop Trinity (sensitivity) | 0.695 | 0.666 |
| M5 | M3 + ACS covariates | 0.738 | 0.711 |
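The relationship between the R² and adjusted R² values in the table can be checked by hand. The sketch below assumes the fixed effects enter as 57 county dummies plus 14 year dummies alongside an intercept; that parameter count is an assumption, not something stated in the scripts.

```python
def adjusted_r2(r2, n_obs, n_params):
    """Adjusted R-squared; n_params excludes the intercept."""
    return 1 - (1 - r2) * (n_obs - 1) / (n_obs - n_params - 1)

n_obs = 58 * 15              # 870 county-year observations
n_fe = (58 - 1) + (15 - 1)   # assumed: one reference category dropped per FE set

m1 = adjusted_r2(0.704, n_obs, n_fe)      # M1: fixed effects only
m2 = adjusted_r2(0.716, n_obs, n_fe + 1)  # M2: adds the AB 403 interaction

print(round(m1, 3), round(m2, 3))
```

Under these assumptions the formula gives 0.678 for M1 and 0.690 for M2, matching the table. The penalty for roughly 70 fixed-effect parameters explains most of the gap between the two columns.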

HC3 Robust Standard Errors

Implemented from scratch in NumPy — no statsmodels dependency. HC3 uses leverage correction to account for high-influence observations (small rural counties with extreme entry rates). The hat matrix diagonal is computed efficiently using np.einsum without materialising the full N×N matrix.

See the project report appendix for a plain-language explanation of HC3 and the NumPy implementation line by line.
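For readers without the report to hand, a compact version of the idea is sketched below. It is an illustration of HC3 with an `np.einsum` leverage computation, written for this README; the function name `ols_hc3` and the synthetic data are assumptions, and the code in fc_03_model.py may be organised differently.

```python
import numpy as np

def ols_hc3(X, y):
    """OLS coefficients with HC3 heteroskedasticity-robust standard errors."""
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    # Leverage h_i = x_i' (X'X)^-1 x_i, computed row by row without the N x N hat matrix.
    h = np.einsum("ij,jk,ik->i", X, XtX_inv, X)
    # HC3 inflates each squared residual by 1 / (1 - h_i)^2.
    w = (resid / (1.0 - h)) ** 2
    meat = (X * w[:, None]).T @ X
    cov = XtX_inv @ meat @ XtX_inv
    return beta, np.sqrt(np.diag(cov)), h

# Synthetic heteroskedastic data for demonstration only.
rng = np.random.default_rng(1)
X = np.column_stack([np.ones(200), rng.normal(size=(200, 2))])
y = X @ np.array([1.0, 0.5, -0.3]) + rng.normal(size=200) * (1 + np.abs(X[:, 1]))
beta, se, h = ols_hc3(X, y)
```

The leverages sum to the number of columns in `X`, which is a quick sanity check on the einsum expression. Observations with leverage close to 1 make the HC3 weights very large, which is the intended penalty for high-influence rows.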

COVID Event Study

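Outcome: 4-P1 permanency rate (% achieving permanency in 12 months).
Baseline: 2019 (omitted year).
Covers: 55 counties, 2015–2024.
Pre-trend joint F-test: F(4, 486) = 2.08, p = 0.082. The pre-2019 coefficients are not jointly significant at the 5% level, which is consistent with the parallel trends assumption but does not prove it.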
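The p-value for the joint pre-trend test can be recovered from the reported F statistic and degrees of freedom with SciPy. This sketch only re-derives the tail probability; it does not re-estimate the event study.

```python
from scipy.stats import f

# Reported test: 4 pre-period coefficients (2015-2018), 486 residual degrees of freedom.
f_stat, df_num, df_den = 2.08, 4, 486

# Upper-tail probability of the F distribution at the observed statistic.
p_value = f.sf(f_stat, df_num, df_den)
print(f"p = {p_value:.3f}")  # compare with the reported p = 0.082
```

A value between 0.05 and 0.10 means the pre-trends are borderline: not rejected at the 5% level, but rejected at the 10% level.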

Pre-Modelling Validation (fc_02b_premodel_checks.py)

  1. Parallel pre-trends test — formal test for differential pre-trends by county tercile
  2. Outcome transformation check — raw vs log vs sqrt skewness by year
  3. AB 403 signal — pre-reform entry rate vs post-reform change, correlation test
  4. Data availability — masked cells by county and year
  5. Entry rate vs permanency — correlation check to justify modelling separately

Key Figures

| Figure | File | What it shows |
|---|---|---|
| Statewide trends | fc_02_statewide_trends.png | Entry rate, in-care rate, and permanency 2010–2024 |
| County variation | fc_03_county_variation.png | Pre-COVID entry rate by county, sorted |
| COVID event window | fc_04_covid_event_window.png | Indexed entry rate around 2020, each county to its own 2019 baseline |
| AB 403 signal | fc_09_ab403_signal.png | Pre-reform rate vs post-reform change |
| Parallel trends | fc_07_parallel_trends.png | Entry rate by county tercile 2010–2024 |
| Event study | fc_13_event_study_p1.png | P1 permanency coefficients relative to 2019 |
| P1 heatmap | fc_10_p1_heatmap.png | Permanency rates by county and year |
| Model residuals | fc_12_modelA_residuals.png | Residual diagnostics for M3 |
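The indexing used in the COVID event window figure (each county relative to its own 2019 value) can be sketched as below. The helper name `index_to_baseline`, the column names and the choice of 2019 = 100 are assumptions for illustration.

```python
import pandas as pd

def index_to_baseline(panel, col="entry_rate", base_year=2019):
    """Express each county's series as a percentage of its own base-year value."""
    base = panel.loc[panel["year"] == base_year].set_index("county")[col]
    return 100 * panel[col] / panel["county"].map(base)

# Toy data with made-up rates.
toy_panel = pd.DataFrame({
    "county": ["A", "A", "B", "B"],
    "year": [2019, 2020, 2019, 2020],
    "entry_rate": [4.0, 3.0, 2.0, 1.0],
})
toy_panel["indexed"] = index_to_baseline(toy_panel)
```

Indexing puts high-rate and low-rate counties on a common scale, so a drop from 4.0 to 3.0 and a drop from 2.0 to 1.0 show up as 75 and 50 respectively.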

Limitations

  • The COVID entry rate drop reflects reduced detection by mandatory reporters, not necessarily reduced maltreatment
  • AB 403 implementation dates vary by county — a single post-2017 dummy is a simplification
  • ACS data ends at 2022; 2023–2024 values are carried forward from 2022
  • HC3 does not account for serial correlation within counties; cluster-robust SEs would be preferred in production
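The carry-forward of ACS values noted in the limitations above can be sketched as a per-county forward fill. The helper `extend_acs` and the column names are hypothetical; the actual handling in fc_01_load_data.py may differ.

```python
import pandas as pd

def extend_acs(acs, last_year=2024):
    """Extend a county-year ACS table to last_year, carrying forward the latest values."""
    years = range(int(acs["year"].min()), last_year + 1)
    full = pd.MultiIndex.from_product(
        [acs["county"].unique(), years], names=["county", "year"]
    )
    out = acs.set_index(["county", "year"]).reindex(full).sort_index()
    # Forward fill within each county so values never leak across counties.
    out = out.groupby(level="county").ffill()
    return out.reset_index()

# Toy input with made-up poverty rates ending in 2022.
acs = pd.DataFrame({
    "county": ["A", "A", "B", "B"],
    "year": [2021, 2022, 2021, 2022],
    "child_poverty": [0.20, 0.18, 0.10, 0.12],
})
extended = extend_acs(acs)
```

Carried-forward covariates have no within-county variation after 2022, so they contribute nothing to identification in 2023–2024 once county fixed effects are included.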

Citation

If you use this code or analysis, please cite:

Kachwala, N. (2025). California Foster Care Panel Analysis: AB 403 Reform
and COVID-19 Effects on Entry Rates (2010–2024).
GitHub: https://github.com/nishreenk/ca-foster-care-panel-analysis

Data citation:

CCWIP (2025). CCWIP reports. Retrieved 2025, from University of California
at Berkeley California Child Welfare Indicators Project website: ccwip.berkeley.edu

U.S. Census Bureau. American Community Survey 5-Year Estimates, 2010–2022.

Contact

Nishreen Kachwala · GitHub: @nishreenk · Medium: @nishreenk

About

California foster care entry rates 2010–2024 — panel regression evaluating AB 403 Continuum of Care Reform and COVID-19 structural break across 58 counties. Two-way fixed effects, event study, HC3 robust SEs. Python.
