This project explores how player fatigue affects NBA performance, specifically plus-minus, using historical game data and machine learning techniques. The goal is to evaluate whether workload- and recovery-related factors can help predict changes in on-court impact.
NBA players experience varying levels of fatigue due to travel, minutes played, and game scheduling. This project analyzes these factors to determine their relationship with player plus-minus, a commonly used performance metric.
Using PySpark and regression models, the notebook builds a pipeline that:
- Cleans and preprocesses NBA game data
- Engineers fatigue-related features
- Trains regression models to predict plus-minus
- Evaluates model performance
The emphasis is on understanding trends and identifying which fatigue-related features contribute most to predicting plus-minus.
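To make the fatigue-related features concrete, here is a minimal plain-Python sketch that derives rest days, a back-to-back flag, and trailing average minutes from one player's game log. The field names (`game_date`, `minutes`), the three-game window, and the sample rows are assumptions for illustration and are not taken from the notebook, which does this work in PySpark.

```python
from datetime import date


def add_fatigue_features(games, window=3):
    """Return new rows with rest_days, back_to_back, and avg_minutes_prev.

    `games` is a list of dicts for ONE player, using the hypothetical keys
    `game_date` (datetime.date) and `minutes` (float).
    """
    ordered = sorted(games, key=lambda g: g["game_date"])
    rows = []
    for i, game in enumerate(ordered):
        prev = ordered[i - 1] if i > 0 else None
        # Days since the previous game; None for the first game in the log.
        rest_days = (game["game_date"] - prev["game_date"]).days if prev else None
        # Average minutes over up to `window` PRIOR games. The current game
        # is excluded so the feature does not leak the outcome being predicted.
        recent = [g["minutes"] for g in ordered[max(0, i - window):i]]
        avg_minutes_prev = sum(recent) / len(recent) if recent else None
        rows.append({
            **game,
            "rest_days": rest_days,
            "back_to_back": rest_days == 1,
            "avg_minutes_prev": avg_minutes_prev,
        })
    return rows


# Made-up sample log, for illustration only.
sample_log = [
    {"game_date": date(2024, 1, 1), "minutes": 34.0},
    {"game_date": date(2024, 1, 2), "minutes": 38.0},
    {"game_date": date(2024, 1, 5), "minutes": 30.0},
]
features = add_fatigue_features(sample_log)
```

In a PySpark pipeline the same idea would typically be expressed with window functions partitioned by player and ordered by game date, rather than a Python loop.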
- Python
- PySpark
- Pandas
- NumPy
- Google Colab
- Feature engineering focused on fatigue indicators such as workload and recovery windows
- Regression-based modeling for plus-minus prediction
- Model evaluation using standard regression metrics
- Exploratory analysis of fatigue effects on performance
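The list above does not name the regression metrics used. As a sketch, assuming the common choices of MAE, RMSE, and R², they can be computed as follows. The function name and the sample values are hypothetical and do not come from the notebook.

```python
import math


def regression_metrics(y_true, y_pred):
    """Compute MAE, RMSE, and R^2 for two equal-length lists of numbers."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal in length")
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)             # residual sum of squares
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    return {
        "mae": sum(abs(e) for e in errors) / n,
        "rmse": math.sqrt(ss_res / n),
        # R^2 is undefined when every true value is identical.
        "r2": 1 - ss_res / ss_tot if ss_tot > 0 else float("nan"),
    }


# Made-up plus-minus values, for illustration only.
actual = [5.0, -3.0, 2.0, -4.0]
predicted = [3.0, -1.0, 2.0, -6.0]
metrics = regression_metrics(actual, predicted)
```

Reporting MAE next to RMSE is useful for plus-minus because RMSE penalizes large misses more heavily, so a wide gap between the two suggests a few games are predicted very poorly. PySpark's `RegressionEvaluator` exposes the same three metrics through its `metricName` parameter.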
- NBA_Fatigue_Predictor.ipynb: main notebook containing data processing, modeling, and evaluation
- Open the notebook in Google Colab or a local Jupyter environment with PySpark installed, then run the cells in order.
- Incorporate additional contextual features such as opponent strength
- Try classification models instead of regression
- Expand evaluation across multiple seasons