fev-macro is a reproducible US real-GDP forecasting benchmark built on fev rolling-window evaluation. It combines historical FRED vintage panels, release-consistent GDP truth from ALFRED, and both classical and factor-style models. The repo is organized around one authoritative pipeline for panel building, evaluation, realtime OOS scoring, and latest-vintage 2025Q4 forecasting.
- Build vintage panels for FRED-QD and FRED-MD historical vintages.
- Build processed panels using FRED transform codes plus FRED-MD outlier/trimming semantics (following the FRED databases code zip and the fbi library).
- Build GDP release truth table from ALFRED and compute release-vintage q/q and q/q SAAR growth for first/second/third releases.
- Run fev evaluation for both processed and unprocessed panels, with different target objectives (LL vs G) and release-truth mappings.
- Produce a 2025Q4 one-shot forecast using latest FRED API pulls and processed-mode top models.
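The growth measures named above can be sketched as follows. This is a minimal illustration, not the repo's code: the function names are hypothetical, and the compounded annualization `((x_t / x_{t-1})^4 - 1) * 100` is an assumption about the SAAR convention (the pipeline may instead use a log-difference approximation).

```python
import math


def qoq_growth_pct(prev_level: float, curr_level: float) -> float:
    """Simple quarter-over-quarter growth, in percent."""
    return (curr_level / prev_level - 1.0) * 100.0


def qoq_saar_growth_pct(prev_level: float, curr_level: float) -> float:
    """q/q growth compounded over four quarters (annualized), in percent.

    Assumed convention: ((x_t / x_{t-1}) ** 4 - 1) * 100.
    """
    return ((curr_level / prev_level) ** 4 - 1.0) * 100.0


def saar_from_log_levels(prev_log: float, curr_log: float) -> float:
    """Same annualized growth, starting from log levels.

    Useful when a model is trained on log_level (LL) but scored on growth:
    exp(4 * (log x_t - log x_{t-1})) equals (x_t / x_{t-1}) ** 4.
    """
    return (math.exp(4.0 * (curr_log - prev_log)) - 1.0) * 100.0
```

Under these assumptions, a forecast made in log levels and a forecast made directly in SAAR growth land on the same scale, which is what lets the two target objectives share one KPI.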
make setup
make download-historical
make panel-qd panel-md
make panel-qd-processed panel-md-processed
make build-gdp-releases
make eval-unprocessed-standard
make eval-processed-standard

| Mode | Covariates | Training objective | KPI for comparison |
|---|---|---|---|
| unprocessed | Raw vintage covariates | log_level (LL) | q/q SAAR real GDP growth vs ALFRED release truth |
| processed | Transform-code + MD outlier/trimming processed covariates | saar_growth (G) | q/q SAAR real GDP growth vs BEA-verified ALFRED first-vintage truth (qoq_saar_growth_alfred_first_pct) |
Details: docs/data_processing.md
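For orientation, the standard FRED-MD/FRED-QD transformation codes (1 = level, 2 = first difference, 3 = second difference, 4 = log, 5 = log difference, 6 = second log difference, 7 = difference of the percent change) can be sketched as below. The function name `apply_tcode` and the list-based representation are hypothetical; the repo's processed mode may implement this differently and additionally applies the outlier/trimming step, which this sketch omits.

```python
import math
from typing import List, Optional


def _diff(values: List[Optional[float]]) -> List[Optional[float]]:
    """First difference; positions without a valid predecessor become None."""
    out: List[Optional[float]] = [None]
    for prev, curr in zip(values, values[1:]):
        out.append(curr - prev if prev is not None and curr is not None else None)
    return out


def apply_tcode(x: List[float], tcode: int) -> List[Optional[float]]:
    """Apply a FRED-MD/QD transformation code to a series of levels."""
    if tcode == 1:  # no transformation
        return list(x)
    if tcode == 2:  # first difference
        return _diff(x)
    if tcode == 3:  # second difference
        return _diff(_diff(x))
    if tcode in (4, 5, 6):  # log-based transforms require positive levels
        logs: List[Optional[float]] = [math.log(v) for v in x]
        if tcode == 4:
            return logs
        if tcode == 5:
            return _diff(logs)
        return _diff(_diff(logs))
    if tcode == 7:  # first difference of the simple percent change
        pct: List[Optional[float]] = [None] + [b / a - 1.0 for a, b in zip(x, x[1:])]
        return _diff(pct)
    raise ValueError(f"unknown transformation code: {tcode}")
```

For example, `apply_tcode([1.0, 2.0, 4.0, 8.0], 2)` yields `[None, 1.0, 2.0, 4.0]`: differencing costs one leading observation, and second-difference codes cost two.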
Core baselines and multivariate models include naive_last, mean, drift, ar4, auto_arima, auto_ets, theta, local_trend_ssm, random_forest, xgboost, factor_pca_qd, mixed_freq_dfm_md, bvar_minnesota_8, bvar_minnesota_20, bvar_minnesota_growth_8, bvar_minnesota_growth_20, chronos2, and opt-in LSTM variants lstm_univariate / lstm_multivariate (plus optional ensemble variants).
LSTM variants require PyTorch (torch>=2.2.0) and are intentionally opt-in (not part of default model sets).
python scripts/run_eval_processed.py --models lstm_univariate lstm_multivariate --num_windows 20

Full model catalog: docs/models.md
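Because the LSTM variants depend on an optional package, a caller may want to check for it before requesting those models. The following is a hedged sketch using only the standard library; `parse_release` and `lstm_backend_available` are hypothetical helpers, not part of the repo.

```python
from importlib import metadata
from typing import Tuple


def parse_release(version: str) -> Tuple[int, ...]:
    """Turn '2.2.0+cu121' into (2, 2, 0).

    Parsing stops at the first non-numeric component, so a pre-release
    such as '2.2.0a0' becomes (2, 2) and compares below (2, 2, 0).
    """
    parts = []
    for token in version.split("+")[0].split("."):
        if not token.isdigit():
            break
        parts.append(int(token))
    return tuple(parts)


def lstm_backend_available(
    dist: str = "torch", minimum: Tuple[int, ...] = (2, 2, 0)
) -> bool:
    """Return True if the distribution is installed at or above `minimum`."""
    try:
        installed = metadata.version(dist)
    except metadata.PackageNotFoundError:
        return False
    return parse_release(installed) >= minimum
```

A wrapper script could call `lstm_backend_available()` and only then append `lstm_univariate` and `lstm_multivariate` to the `--models` list.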
By default, every rolling window trains on an as-of vintage (vintage-correct). Snapshot evaluation is blocked unless you explicitly pass --allow_snapshot_eval. For processed run_eval, release truth defaults to BEA-verified ALFRED q/q SAAR first-vintage growth from data/panels/gdpc1_releases_first_second_third.csv (qoq_saar_growth_alfred_first_pct) via --eval_release_metric alfred_qoq_saar --eval_release_stages first --target_transform saar_growth. ALFRED q/q non-SAAR truth remains available via --eval_release_metric alfred_qoq --target_transform qoq_growth, and realtime SAAR truth remains available via --eval_release_metric realtime_qoq_saar --target_transform saar_growth.
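The vintage-correct rule amounts to picking, for each rolling window, the latest vintage published on or before the window's cutoff. A minimal sketch under that reading, with a hypothetical helper name and plain dates standing in for vintage files:

```python
from bisect import bisect_right
from datetime import date
from typing import List, Optional


def select_asof_vintage(vintages: List[date], cutoff: date) -> Optional[date]:
    """Return the most recent vintage dated on or before `cutoff`.

    Returns None when every vintage is later than the cutoff, in which
    case the window has no data that was available in real time.
    """
    ordered = sorted(vintages)
    # bisect_right gives the count of vintages <= cutoff.
    idx = bisect_right(ordered, cutoff)
    return ordered[idx - 1] if idx > 0 else None
```

Snapshot evaluation corresponds to ignoring the cutoff and always using the newest vintage, which leaks revisions into training; that is presumably why it sits behind an explicit flag.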
make fetch-latest && make process-latest && make latest-forecast-processed
make plot-2025q4

- data/historical/: downloaded FRED vintage archives and extracted CSVs
- data/panels/: generated vintage panels and GDP release truth table
- data/latest/, data/processed/: latest API pulls and processed latest snapshots
- results/: evaluation outputs, leaderboards, realtime OOS metrics, and forecast plots
- scripts/: core pipeline entrypoints
- scripts/dev/: non-core development utilities
- docs/: deeper protocol/model/data notes
- FRED databases historical vintages: https://www.stlouisfed.org/research/economists/mccracken/fred-databases
- FRED databases code zip (transform codes + trimming/outliers): https://www.stlouisfed.org/-/media/project/frbstl/stlouisfed/research/fred-md/fred-databases_code.zip?sc_lang=en&hash=82A2EEE1EF3498C0820EB2212531D895
- fbi library: https://github.com/cykbennie/fbi
- ALFRED: https://alfred.stlouisfed.org
- FRED API: https://api.stlouisfed.org/fred
- fev (Forecast EValuation library): https://github.com/autogluon/fev