A Python package for estimating metrics at small geographic areas (census places, tracts) from county-level data.
This package was developed by the Bloomberg Center for Government Excellence (GovEx) Research & Analytics team to address a common challenge: publicly available data often exists at the county level but not at smaller geographic levels like cities.
The default model is a linear mixed-effects regression that learns the relationship between predictors (like poverty rate) and your outcome across all counties, while accounting for state-specific differences. This county-level pattern is then applied to estimate values for smaller areas.
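As a rough sketch, a model of this kind is often written with a state-level random intercept. The notation below is an assumption for illustration and is not taken from the package; the actual specification (for example, whether slopes also vary by state) should be checked in the Trainer source.

```latex
y_{cst} = \beta_0 + \beta_1 x_{cst} + u_s + \varepsilon_{cst},
\qquad u_s \sim \mathcal{N}(0, \sigma_u^2),
\quad \varepsilon_{cst} \sim \mathcal{N}(0, \sigma_\varepsilon^2)
```

Here y_{cst} is the target for county c in state s and year t, x_{cst} is the poverty rate, the beta coefficients are shared across all counties, and u_s is the state-specific deviation. Under this assumed form, the estimate for a small area a in state s reuses the fitted county-level terms:

```latex
\hat{y}_{a} = \hat{\beta}_0 + \hat{\beta}_1 x_{a} + \hat{u}_s
```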
Alternative models and additional features may perform better for specific use cases. The Trainer and Predictor classes include extension patterns.
This approach is inspired by the methodology of Zhang et al. (2014) but works in the reverse direction: they start with individual-level health survey data and aggregate up, while this package starts with county-level data and estimates down. Both use hierarchical geographic structure and poverty rate for estimation.
The only way to validate small-area estimates is with datasets that exist at both levels. See examples/ground_truth_evaluation.ipynb for validation with ground truth on health insurance rates.
Before using this package:
- Check model performance on your county training data
- Validate with ground truth at smaller areas when possible
- Decide if performance is acceptable for your use case
Requires Python 3.12+
git clone https://github.com/govex/small-area-estimator.git
cd small-area-estimator
uv venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
uv pip install -e .
from sae.data import FeatureExtractor
from sae.model import Trainer, Predictor
from sae.evaluation import Evaluator
# 1. Fetch county poverty rates (Census API key required)
extractor = FeatureExtractor(api_key="your_key")
county_features = extractor.get_county_poverty(years=range(2012, 2024))
# 2. Merge with your county target data (must have: geo_id, year, target_metric)
training_data = county_features.merge(your_county_data, on=['geo_id', 'year'])
# 3. Train model
trainer = Trainer(target_col='target_metric')
trainer.fit(training_data)
trainer.save('model.pkl')
# 4. Evaluate performance on counties
evaluator = Evaluator(trainer.model_dict)
evaluator.print_summary()
evaluator.make_diagnostic_plots(output_dir='outputs/')
# 5. Predict on places
places_features = extractor.get_place_poverty(years=[2024])
predictor = Predictor.load('model.pkl')
predictions = predictor.predict(places_features)
- quick_start.py - Complete workflow script
- ground_truth_evaluation.py - Validation with a ground truth metric on health insurance rates
- infant_mortality_sae.py - County-to-place infant mortality estimation
- voter_turnout_sae_comparison.py - County-to-tract voter turnout estimation and alternative model/feature comparison
Your county-level target data needs:
- geo_id: 5-digit county FIPS code
- year: Year of observation (integer)
- target_metric: The value you want to estimate. Use rates, percentages, or per-capita measures rather than raw counts that scale with population size.
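A common pitfall when preparing these columns is that FIPS codes lose their leading zeros when read as numbers, and that targets arrive as raw counts. The sketch below shows one way to normalize both before merging. The helper names, the `events` and `population` fields, and the sample values are hypothetical and for illustration only; they are not part of the package and are not real data.

```python
def normalize_county_fips(value) -> str:
    """Return a 5-character, zero-padded county FIPS string."""
    text = str(value).strip()
    # Values read as floats (e.g. 1001.0) carry a trailing ".0"
    if text.endswith(".0"):
        text = text[:-2]
    if not text.isdigit() or len(text) > 5:
        raise ValueError(f"not a county FIPS code: {value!r}")
    return text.zfill(5)


def rate_per_1000(count: float, population: float) -> float:
    """Convert a raw count into a rate per 1,000 residents."""
    if population <= 0:
        raise ValueError("population must be positive")
    return 1000.0 * count / population


# Illustrative rows only (made-up counts and populations)
rows = [
    {"geo_id": 1001, "year": 2020, "events": 12, "population": 58000},
    {"geo_id": "24510", "year": 2020, "events": 95, "population": 585000},
]

training_rows = [
    {
        "geo_id": normalize_county_fips(r["geo_id"]),
        "year": int(r["year"]),
        "target_metric": rate_per_1000(r["events"], r["population"]),
    }
    for r in rows
]

print(training_rows[0]["geo_id"])
```

Keeping geo_id as a string on both sides of the merge avoids silently dropping counties in states whose FIPS codes start with zero.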
For best results:
- Hundreds or thousands of county-year observations across multiple years
- Data from multiple states
- Target metric is correlated with poverty rate (otherwise add additional features)
After training, you'll see output like this (example values shown):
Model Performance Summary (on county training data)
RMSE: 11.86 (19.8%)
MAE: 9.89 (16.5%)
Mean target: 59.82
What this means:
- Your county predictions are typically off by ~10 percentage points (MAE in this example). MAE and RMSE are easy to interpret but don't tell the whole story, so check the diagnostic plots
- What constitutes "good" performance depends on your use case:
  - MAE <5pp might be excellent for some metrics
  - MAE of 10-15pp might be acceptable if no better alternative exists
- Assume that small-area errors could be higher than the county errors reported here
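To make the summary numbers concrete, here is a minimal sketch of how RMSE, MAE, and their percentages can be computed. The percentages in the sample output are consistent with the error divided by the mean target (11.86 / 59.82 ≈ 19.8%, 9.89 / 59.82 ≈ 16.5%); that reading is inferred from the numbers shown, not confirmed from the package source. The function name and sample values are hypothetical.

```python
import math


def summarize_errors(actual, predicted):
    """Compute RMSE and MAE, plus each as a percentage of the mean target."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("actual and predicted must be non-empty and equal length")
    errors = [p - a for a, p in zip(actual, predicted)]
    n = len(errors)
    mean_target = sum(actual) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mae = sum(abs(e) for e in errors) / n
    return {
        "rmse": rmse,
        "mae": mae,
        "mean_target": mean_target,
        "rmse_pct": 100.0 * rmse / mean_target,
        "mae_pct": 100.0 * mae / mean_target,
    }


# Made-up values for illustration
actual = [50.0, 60.0, 70.0]
predicted = [55.0, 50.0, 75.0]
summary = summarize_errors(actual, predicted)
print(f"RMSE: {summary['rmse']:.2f} ({summary['rmse_pct']:.1f}%)")
print(f"MAE: {summary['mae']:.2f} ({summary['mae_pct']:.1f}%)")
```

RMSE squares each error before averaging, so it is always at least as large as MAE and reacts more strongly to a few large misses. A wide gap between the two suggests that a small number of counties drive the error.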
The package generates four diagnostic plots to assess model quality. Review these to understand if your model assumptions hold:
What it shows: Blue dots = actual values, orange line = model predictions by state
What to look for:
- Orange line should pass through the blue dots
- Good fit = dots cluster around the line
- Poor fit = systematic gaps between dots and line
Red flags:
- Predictions consistently above or below actual values (bias)
- Line is flat when dots show a clear trend (model missing the relationship)
What it shows: Prediction errors (residuals) across poverty rates
What to look for:
- Random scatter around the dashed zero line
- No patterns or shapes (funnel, curve, clusters)
Red flags:
- Errors get bigger at high/low values (heteroscedasticity)
- Curve or other shape reflecting a non-linear relationship that the model is missing
- All points above or below zero (systematic bias)
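The red flags above can also be checked numerically. The sketch below groups residuals into poverty-rate bins and reports the mean and spread in each: a mean far from zero in some bins hints at bias or a missed curve, and a spread that grows across bins hints at a funnel shape. The function name, bin edges, and sample values are hypothetical and not part of the package.

```python
from statistics import mean, pstdev


def residuals_by_bin(poverty_rates, residuals, edges):
    """Summarize residuals within consecutive poverty-rate bins [low, high)."""
    summary = []
    for low, high in zip(edges[:-1], edges[1:]):
        in_bin = [r for x, r in zip(poverty_rates, residuals) if low <= x < high]
        if in_bin:  # skip empty bins
            summary.append({
                "bin": (low, high),
                "n": len(in_bin),
                "mean": mean(in_bin),
                "spread": pstdev(in_bin),
            })
    return summary


# Made-up values for illustration: the spread widens as poverty rate rises
poverty_rates = [5, 8, 12, 14, 22, 25, 28]
residuals = [1, -1, 2, -2, 6, -7, 8]
summary = residuals_by_bin(poverty_rates, residuals, edges=[0, 10, 20, 30])
for row in summary:
    print(row["bin"], row["n"], round(row["mean"], 2), round(row["spread"], 2))
```

Note that a value exactly equal to the last edge falls outside every bin, so choose the final edge above the largest poverty rate in your data.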
What it shows: Whether prediction errors follow a normal distribution
What to look for:
- QQ Plot (diagonal dashed line): Points should follow the line, reflecting normal distribution
- Histogram: Bell-shaped, reflecting normally distributed prediction errors
Red flags:
- Non-bell-shaped histogram: May reflect small sample size, consistent over/under-predictions, or groups of counties with different behaviors
- Curved tails in QQ plot: Upper tail curves up = predictions too high; lower tail curves down = predictions too low
- Many states showing these patterns: A couple of unusual states are to be expected, but if many states show these patterns, then model assumptions may not hold
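For readers who want to see what a QQ plot compares, the sketch below pairs sorted, standardized residuals with the standard normal quantiles they would match if the errors were normally distributed. It uses only the standard library. The function name and the (i + 0.5) / n plotting positions are assumptions made for illustration; the package may compute its plot differently.

```python
from statistics import NormalDist, mean, pstdev


def qq_points(residuals):
    """Pair theoretical normal quantiles with standardized, sorted residuals."""
    n = len(residuals)
    center, scale = mean(residuals), pstdev(residuals)
    if scale == 0:
        raise ValueError("residuals have zero spread")
    ordered = sorted((r - center) / scale for r in residuals)
    normal = NormalDist()  # standard normal: mean 0, standard deviation 1
    theoretical = [normal.inv_cdf((i + 0.5) / n) for i in range(n)]
    return list(zip(theoretical, ordered))


# Made-up symmetric residuals for illustration
points = qq_points([-2, -1, 0, 1, 2])
for expected, observed in points:
    print(round(expected, 3), round(observed, 3))
```

If the residuals are roughly normal, the observed values track the expected ones closely, which is what points lying along the diagonal line represent.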
What it shows: How model performance varies by state
What to look for:
- Blue bars (RMSE) should be small relative to orange bars (mean target)
- Error bars (standard deviation) show data variability
Red flags:
- States with very large RMSE = poor fit in those states
- May need state-specific investigation or a different approach
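The per-state comparison can be reproduced outside the plot. The sketch below groups county rows by state, using the fact that the first two digits of a 5-digit county FIPS code are the state FIPS code, and computes RMSE alongside the mean target for each state. The function name, the `actual` and `predicted` fields, and the sample values are hypothetical and not part of the package.

```python
import math
from collections import defaultdict


def rmse_by_state(rows):
    """Return RMSE and mean target per state, keyed by 2-digit state FIPS."""
    squared_errors = defaultdict(list)
    targets = defaultdict(list)
    for row in rows:
        state = row["geo_id"][:2]  # state FIPS prefix of the county code
        squared_errors[state].append((row["predicted"] - row["actual"]) ** 2)
        targets[state].append(row["actual"])
    return {
        state: {
            "rmse": math.sqrt(sum(sq) / len(sq)),
            "mean_target": sum(targets[state]) / len(targets[state]),
        }
        for state, sq in squared_errors.items()
    }


# Made-up values for illustration
rows = [
    {"geo_id": "01001", "actual": 50.0, "predicted": 53.0},
    {"geo_id": "01003", "actual": 60.0, "predicted": 56.0},
    {"geo_id": "24510", "actual": 70.0, "predicted": 58.0},
]
by_state = rmse_by_state(rows)
for state, stats in sorted(by_state.items()):
    print(state, round(stats["rmse"], 2), round(stats["mean_target"], 2))
```

Sorting this result by RMSE relative to the mean target is a quick way to shortlist the states that need closer investigation.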
Train only on urban counties (for urban place predictions):
trainer = Trainer(
    target_col='your_metric',
    feature_filters={'urban_quartile': [0, 1]}  # 50%+ urban
)
Temporal validation (use only historical data):
trainer = Trainer(
    target_col='your_metric',
    year_filter='before',
    year_value=2024
)
This is an early-stage project. Contributions welcome! Open issues for bugs or feature requests, or share your SAE approach!
MIT License. See LICENSE for details.
Developed by the Bloomberg Center for Government Excellence (GovEx) Research & Analytics team.