Small project for predicting earthquake damage grades of buildings.
This is based on the DrivenData training competition:
https://www.drivendata.org/competitions/57/nepal-earthquake/
| Model | Score |
|---|---|
| Competition top leaderboard score | 0.7558 |
| Best score achieved in this repository | 0.727 |
The repository includes:
- Feature engineering in Python
- Random Forest baseline notebook
- XGBoost training and tuning notebooks
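As a rough illustration of what a Random Forest baseline looks like, here is a minimal sketch. It is not the code from the notebooks. It runs on synthetic three-class data standing in for the damage grades, and it assumes the scores in the table above are micro-averaged F1. The class count, the metric, and every hyperparameter are assumptions chosen for the example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the competition data (assumption: three target classes).
X, y = make_classification(
    n_samples=300,
    n_features=10,
    n_informative=5,
    n_classes=3,
    random_state=0,
)

# Hold out a quarter of the rows for evaluation, keeping class proportions.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Illustrative hyperparameters only, not the values used in the notebooks.
model = RandomForestClassifier(n_estimators=50, random_state=0)
model.fit(X_train, y_train)

pred = model.predict(X_test)

# Micro-averaged F1 (assumed metric); for single-label data it equals accuracy.
score = f1_score(y_test, pred, average="micro")
print(f"micro F1 on held-out synthetic data: {score:.3f}")
```

Swapping in the real features and labels only changes how `X` and `y` are built; the split, fit, and scoring steps stay the same.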
Files:
- `feature_engineering.py`
- `02-feature_eng.ipynb`
- `03-rand_forest_classifier.ipynb`
- `04-xgb_classifier.ipynb`
- `07-simple_xgb-tuning.ipynb`
- `requirements.txt`
Put competition training files in data/:
- `data/train_values.csv`
- `data/train_labels.csv`
The script merges them on building_id.
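The merge on building_id can be sketched with pandas as follows. This is a minimal sketch under stated assumptions, not the repository's script: the function names `merge_values_and_labels` and `load_training_data` are hypothetical, and the inner join with a one-to-one check is a design choice made for the example.

```python
from pathlib import Path

import pandas as pd


def merge_values_and_labels(values: pd.DataFrame, labels: pd.DataFrame) -> pd.DataFrame:
    """Join features and labels on building_id (hypothetical helper)."""
    # validate="one_to_one" raises pandas.errors.MergeError if building_id
    # is duplicated in either frame, so bad input fails loudly.
    return values.merge(labels, on="building_id", how="inner", validate="one_to_one")


def load_training_data(data_dir: str = "data") -> pd.DataFrame:
    """Read both competition CSVs from data_dir and merge them (hypothetical helper)."""
    data_path = Path(data_dir)
    values = pd.read_csv(data_path / "train_values.csv")
    labels = pd.read_csv(data_path / "train_labels.csv")
    return merge_values_and_labels(values, labels)
```

With the files in place, `load_training_data()` returns a single frame with one row per building, holding the feature columns followed by the label column.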
- Create and activate a virtual environment.
- Install dependencies: `pip install -r requirements.txt`
- Start Jupyter: `jupyter notebook`

Main libraries: pandas, scikit-learn, xgboost, optuna, hyperopt, mlflow, jupyter.