This project builds an end-to-end machine learning pipeline to predict whether it will rain tomorrow in Australia, using historical weather data. The project includes:
- Data cleaning and preprocessing
- Feature engineering
- Model training with scikit-learn
- Model evaluation
- An interactive Streamlit web application for:
  - Data inspection & cleaning preview
  - Real-time rain prediction
- Modular project structure (no notebook dependency for training)
- Automated data cleaning
- Missing value handling using imputers
- Logistic Regression classification model
- Streamlit-based UI with:
  - Data Cleaning & EDA page
  - Rain Prediction page
- Reproducible training pipeline
- Production-ready model saving & loading
WeatherForecast/
├── data/
│ └── raw/
│     └── AUS_Weather.csv
├── models/
│ └── weather_logreg.joblib
├── notebooks/
│ ├── exploration.ipynb
│ └── exploration2.ipynb
├── src/
│ ├── __init__.py
│ ├── config.py
│ ├── data.py
│ ├── features.py
│ ├── train.py
│ └── predict.py
├── app.py
├── requirements.txt
├── README.md
└── .gitignore
Algorithm: Logistic Regression
Target Variable: RainTomorrow
Selected Features:
- MinTemp
- MaxTemp
- Humidity3pm
- Pressure3pm
- WindSpeed3pm
- RainToday
Preprocessing:
- Numerical features → Median imputation + StandardScaler
- Categorical features → Mode imputation + OneHotEncoder
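The preprocessing and model choices above can be sketched as a single scikit-learn pipeline. This is a minimal illustration, not necessarily the code in `src/train.py`: the function name `build_pipeline`, the step names, and `max_iter=1000` are assumptions, and treating `RainToday` as the only categorical feature is inferred from the feature list.

```python
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Feature split is an assumption based on the feature list in this README.
NUMERIC_FEATURES = ["MinTemp", "MaxTemp", "Humidity3pm", "Pressure3pm", "WindSpeed3pm"]
CATEGORICAL_FEATURES = ["RainToday"]


def build_pipeline() -> Pipeline:
    """Hypothetical helper: preprocessing + Logistic Regression in one estimator."""
    # Numerical features: median imputation, then standardization.
    numeric = Pipeline([
        ("impute", SimpleImputer(strategy="median")),
        ("scale", StandardScaler()),
    ])
    # Categorical features: mode (most frequent) imputation, then one-hot encoding.
    categorical = Pipeline([
        ("impute", SimpleImputer(strategy="most_frequent")),
        ("encode", OneHotEncoder(handle_unknown="ignore")),
    ])
    preprocess = ColumnTransformer([
        ("num", numeric, NUMERIC_FEATURES),
        ("cat", categorical, CATEGORICAL_FEATURES),
    ])
    return Pipeline([
        ("preprocess", preprocess),
        ("model", LogisticRegression(max_iter=1000)),  # max_iter is an assumed setting
    ])
```

Keeping the imputers and encoders inside the pipeline means the same transformations are applied at training and prediction time, so raw feature values can be passed directly to `predict`.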
1️⃣ Install dependencies
pip install -r requirements.txt
2️⃣ Train the model
python -m src.train
This will generate:
models/weather_logreg.joblib
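For reference, saving and reloading a fitted model with joblib might look like the sketch below. The helper names `save_model` and `load_model` are hypothetical and are not taken from `src/train.py` or `src/predict.py`.

```python
from pathlib import Path

import joblib


def save_model(model, path) -> Path:
    """Hypothetical helper: persist a fitted estimator, creating the folder if needed."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    joblib.dump(model, path)
    return path


def load_model(path):
    """Hypothetical helper: load a previously saved estimator."""
    return joblib.load(path)
```

For example, `save_model(pipeline, "models/weather_logreg.joblib")` would write the file listed above, and `load_model("models/weather_logreg.joblib")` would restore it. Only load joblib files from sources you trust, since loading can execute arbitrary code.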
3️⃣ Run the Streamlit app
streamlit run app.py
✅ Data Cleaning & EDA
Displays raw dataset
Shows missing values
Displays cleaned dataset preview
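A missing-value overview like the one on this page could be computed with pandas roughly as follows. This is only a sketch: `missing_value_summary` is a hypothetical name, and the app's actual implementation may differ.

```python
import pandas as pd


def missing_value_summary(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: count and percentage of missing values per column."""
    counts = df.isna().sum()
    summary = pd.DataFrame({
        "missing": counts,
        "percent": (counts / len(df) * 100).round(2),
    })
    # Keep only columns that actually have gaps, worst first.
    return summary[summary["missing"] > 0].sort_values("missing", ascending=False)
```

The resulting table can be shown in the app next to the raw data, for example with `st.dataframe(...)`.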
✅ Prediction Page
User inputs weather conditions
Outputs:
Rain / No Rain
Prediction probability
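The two outputs above can be derived from a fitted classifier's `predict_proba`. The sketch below is an illustration under assumptions: `predict_rain` is a hypothetical function name, the target labels are assumed to be `"Yes"` and `"No"`, and the 0.5 threshold is an assumed default.

```python
import pandas as pd


def predict_rain(model, inputs: dict, threshold: float = 0.5):
    """Hypothetical helper: return ("Rain" | "No Rain", probability of rain)."""
    # A one-row DataFrame keeps the column names the model was trained on.
    row = pd.DataFrame([inputs])
    # Assumes the positive class is labeled "Yes" in the training data.
    yes_index = list(model.classes_).index("Yes")
    proba_yes = float(model.predict_proba(row)[0][yes_index])
    label = "Rain" if proba_yes >= threshold else "No Rain"
    return label, proba_yes
```

Here `inputs` would be a dictionary of the selected features, such as `{"MinTemp": 12.0, "MaxTemp": 21.5, "Humidity3pm": 70.0, "Pressure3pm": 1008.0, "WindSpeed3pm": 19.0, "RainToday": "Yes"}`, collected from the form on the page.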
Batch predictions via CSV upload
Advanced models (XGBoost, Random Forest)
Model performance visualizations
Deployment on Streamlit Cloud
This project is open-source and free to use for learning and experimentation.