SVR-ARIMA

Standalone ARIMA forecasting implementation for the Sentiment–Volatility Ratio, developed from the data science capstone project:

Predictive Models for the Diagnostic Ratio of Consumer Sentiment and Volatility

This repository contains the ARIMA modeling workflow only. The broader capstone compared AR, ARIMA, MLP, and LSTM models, but this repository preserves and documents the ARIMA implementation separately for review, reproducibility, and future cleanup.

Project Overview

The project models a diagnostic ratio constructed from two macro-financial indicators:

UMCSENT: University of Michigan Consumer Sentiment Index
VIXCLS: CBOE Volatility Index

The modeled series is the log-transformed Sentiment–Volatility Ratio:

log(SVR) = log(UMCSENT) - log(monthly mean VIXCLS)

The goal is to evaluate manually specified ARIMA models for forecasting the transformed ratio over held-out test observations.

Data Sources

The script retrieves data from FRED using pipewelder::get_fred().

The source series are:

UMCSENT, monthly consumer sentiment
VIXCLS, daily VIX close values

Daily VIX observations are aggregated to monthly averages before the ratio is constructed.
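
The aggregation and ratio construction can be sketched in base R as follows. This is a minimal illustration, not the script's actual code: the data frames `vix_daily` and `umcsent`, their column names, and all values are made up for the example, and the real script retrieves its inputs through pipewelder::get_fred().

```r
# Synthetic stand-ins for the FRED series (values are made up for illustration).
vix_daily <- data.frame(
  date  = as.Date(c("1990-01-02", "1990-01-03", "1990-02-01", "1990-02-02")),
  value = c(17.0, 19.0, 24.0, 26.0)
)
umcsent <- data.frame(
  month   = as.Date(c("1990-01-01", "1990-02-01")),
  umcsent = c(93.0, 89.5)
)

# Collapse each daily date to the first day of its month, then average by month.
vix_daily$month <- as.Date(format(vix_daily$date, "%Y-%m-01"))
vix_monthly <- aggregate(value ~ month, data = vix_daily, FUN = mean)
names(vix_monthly)[2] <- "vix_mean"

# Join on month and form log(SVR) = log(UMCSENT) - log(monthly mean VIXCLS).
svr <- merge(umcsent, vix_monthly, by = "month")
svr$log_svr <- log(svr$umcsent) - log(svr$vix_mean)
svr
```

Averaging before taking the logarithm matches the formula given in the Project Overview, which uses the log of the monthly mean rather than the mean of daily logs.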

Modeling Window

The intended modeling window is:

January 1990 through December 2025

The workflow uses:

432 monthly observations (36 years × 12 months)
348 training observations
84 held-out test observations

The split is chronological. No random train/test split is used.
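
A small sketch of how the counts above fit together, assuming the training block is simply the first 348 months of the window (the README gives the counts, not the boundary dates). The variable names are illustrative only.

```r
# Monthly index for the stated window, January 1990 through December 2025.
months  <- seq(as.Date("1990-01-01"), as.Date("2025-12-01"), by = "month")
n_total <- length(months)        # 36 years x 12 months
n_train <- 348                   # training size stated in the README
n_test  <- n_total - n_train     # remaining held-out observations

# Chronological split: earliest observations train, latest observations test.
train_idx <- seq_len(n_train)
test_idx  <- (n_train + 1):n_total

range(months[train_idx])
range(months[test_idx])
```

Under that assumption the training block would end in December 2018 and the test block would begin in January 2019.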

Forecasting Design

The ARIMA workflow evaluates forecasts for:

h = 1, one month ahead
h = 3, three months ahead

The script uses a leakage-safe expanding-window rolling forecast setup.

For each target observation in the test period:

the forecast target is identified;
the forecast origin is set to target - h;
the model is fit only on observations available through the forecast origin;
an h-step forecast is generated;
the h-th forecast value is aligned with the target observation;
the residual is calculated as actual minus forecast.

This means the model is repeatedly refit as the forecast origin advances through the test period.
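
The loop described above can be sketched as follows. This is an illustration under stated assumptions rather than the script's code: it uses base R's stats::arima() and predict() in place of forecast::Arima() so that it runs without extra packages, the function name `rolling_forecast` is hypothetical, and the input is a simulated AR(1) series rather than the SVR data.

```r
# Expanding-window rolling forecast for one ARIMA order and one horizon.
# y: numeric vector in time order; test_idx: integer positions of the targets.
rolling_forecast <- function(y, test_idx, order, h) {
  rows <- lapply(test_idx, function(target) {
    origin <- target - h                    # last observation the model may see
    fit <- stats::arima(
      y[seq_len(origin)],                   # refit on data through the origin only
      order = order,
      include.mean = (order[2] == 0)        # mean term only when d = 0
    )
    # Forecast h steps ahead and keep only the h-th step, which lands on target.
    fc <- as.numeric(predict(fit, n.ahead = h)$pred)[h]
    data.frame(target = target, origin = origin,
               actual = y[target], forecast = fc,
               residual = y[target] - fc)   # actual minus forecast
  })
  do.call(rbind, rows)
}

# Toy usage on a simulated AR(1) series (not the SVR data).
set.seed(1)
y <- as.numeric(arima.sim(model = list(ar = 0.6), n = 120))
res <- rolling_forecast(y, test_idx = 101:120, order = c(1, 0, 0), h = 3)
head(res)
```

Because each fit sees only `y[1:origin]`, changing any value after a given origin leaves that origin's forecast unchanged, which is the sense in which the design avoids leakage.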

ARIMA Grid

The script evaluates manually specified ARIMA models using the forecast package.

Candidate values are:

p = 1, 2, 3
d = 0, 1
q = 1, 2, 3

This produces 3 × 2 × 3 = 18 ARIMA specifications.

Models with d = 0 are estimated with a mean term.

Models with d = 1 are estimated without a mean term.

Model identifiers use the compact format:

ARIMApdq

For example:

ARIMA103 = ARIMA(1, 0, 3)
ARIMA213 = ARIMA(2, 1, 3)
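
One way to enumerate the grid and build the identifiers in base R is sketched below. The object name `grid` and the column `include_mean` are illustrative and may not match the script.

```r
# All combinations of the candidate orders: 3 x 2 x 3 = 18 rows.
grid <- expand.grid(p = 1:3, d = 0:1, q = 1:3)

# Compact identifier: "ARIMA" followed by p, d, q with no separators.
grid$model_id <- paste0("ARIMA", grid$p, grid$d, grid$q)

# Mean term is included only for the undifferenced (d = 0) models.
grid$include_mean <- grid$d == 0

nrow(grid)
grid[grid$model_id %in% c("ARIMA103", "ARIMA213"), ]
```

The compact format stays unambiguous here because every candidate order is a single digit.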

Evaluation Metrics

Forecast accuracy is evaluated on the held-out test period using:

MSE, mean squared error
RMSE, root mean squared error
MAE, mean absolute error

MAE is the primary metric used for comparing model performance.

Metrics are calculated on the log-transformed SVR scale.
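
The three metrics can be computed from aligned actual and forecast vectors as in the sketch below. The function name `forecast_metrics` is hypothetical, and the input numbers are arbitrary values chosen to make the arithmetic easy to follow, not project results.

```r
# Compute MSE, RMSE, and MAE from aligned actual and forecast values.
forecast_metrics <- function(actual, forecast) {
  residual <- actual - forecast            # same sign convention as the README
  mse <- mean(residual^2)
  c(test_mse  = mse,
    test_rmse = sqrt(mse),
    test_mae  = mean(abs(residual)))
}

# Arbitrary illustrative inputs: residuals are -0.5, 0, and 1.
m <- forecast_metrics(actual = c(1.0, 2.0, 3.0), forecast = c(1.5, 2.0, 2.0))
m
```

For these inputs the MAE is 0.5 and the MSE is 1.25 / 3, so the single large residual affects MSE and RMSE more heavily than MAE.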

Output

The script produces a wide-format metrics table with one row per model and forecast horizon.

The final table includes:

model_id
horizon
test_mse
test_rmse
test_mae

The script also exports the results to:

ARIMA Metrics FINAL.csv
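
The layout of the exported table can be illustrated as follows. This is only a sketch of the column structure: the metric values are left as NA rather than invented, the two model identifiers are picked arbitrarily from the grid, and the file is written to a temporary directory (an assumption made here so the example never overwrites a real export).

```r
# Placeholder rows: one row per model and horizon, metrics left as NA.
results <- data.frame(
  model_id  = c("ARIMA101", "ARIMA101", "ARIMA111", "ARIMA111"),
  horizon   = c(1L, 3L, 1L, 3L),
  test_mse  = NA_real_,
  test_rmse = NA_real_,
  test_mae  = NA_real_
)

# Write to a temporary directory rather than the project folder.
out_path <- file.path(tempdir(), "ARIMA Metrics FINAL.csv")
write.csv(results, out_path, row.names = FALSE)

# Read the file back to confirm the column layout survives the round trip.
check <- read.csv(out_path)
names(check)
```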

Repository Scope

This repository is intentionally limited to the ARIMA workflow.

It does not include:

AR model implementation
MLP model implementation
LSTM model implementation
Shiny dashboard code
deployment configuration
website integration
automated model refresh
production forecasting services

The current goal is to preserve, document, and validate the original ARIMA research workflow before making larger cleanup or refactoring decisions.

Reproducibility Notes

Exact numeric reproduction may depend on:

current FRED data availability
source data revisions
R version
package versions
forecast::Arima() behavior
numerical optimization behavior

The purpose of this repository is to preserve the ARIMA algorithm and evaluation design clearly. If results differ from historical capstone tables, those differences should be documented rather than hidden.

Main Dependencies

The script uses:

library(pipewelder)
library(tidyverse)
library(lubridate)
library(forecast)

How to Run

Open the project in RStudio or another R environment and run the ARIMA script, SVR-ARIMA.R.

The script performs the full workflow:

retrieve raw FRED data;
aggregate daily VIX data to monthly averages;
construct the log-transformed Sentiment–Volatility Ratio;
create the chronological train/test split;
define the ARIMA parameter grid;
generate rolling forecasts for each model and horizon;
calculate forecast accuracy metrics;
reshape the results table;
export the final CSV.

License

See the repository license for usage terms.
