This repository contains a comprehensive data science workflow for predicting residential home prices.
🔗 Original Notebook on Kaggle: View here
- Kaggle Score (RMSE): 0.12676
- Model: Ridge Regression with hyperparameter tuning.
- Exploratory Data Analysis (EDA): Detailed visualization of target distribution and correlation matrices.
- Advanced Preprocessing:
  - Handling missing values based on feature context.
  - Box-Cox transformation and log-scaling for skewed numerical features.
  - Categorical encoding for quality-related features.
- Modeling: Regularized linear models (Ridge) to limit overfitting given the high number of features (79+).
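
The skew handling and Ridge tuning listed above can be sketched roughly as follows. This is a minimal illustration on synthetic data, not the notebook's actual code: the helper name `transform_skewed`, the skewness threshold of 0.75, the Box-Cox lambda of 0.15, the alpha grid, and the column names are all assumptions chosen for the example.

```python
import numpy as np
import pandas as pd
from scipy.special import boxcox1p
from scipy.stats import skew
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import cross_val_score


def transform_skewed(df, threshold=0.75, lam=0.15):
    """Apply Box-Cox(1 + x) to numeric columns whose |skewness| exceeds threshold.

    threshold and lam are illustrative defaults, not values from the notebook.
    Assumes the affected columns are non-negative.
    """
    out = df.copy()
    numeric_cols = out.select_dtypes(include=[np.number]).columns
    skewness = out[numeric_cols].apply(lambda col: skew(col.dropna()))
    skewed_cols = list(skewness[skewness.abs() > threshold].index)
    if skewed_cols:
        out[skewed_cols] = boxcox1p(out[skewed_cols].to_numpy(dtype=float), lam)
    return out, skewed_cols


# Synthetic stand-in for the housing data (hypothetical columns).
rng = np.random.default_rng(0)
n = 200
X = pd.DataFrame({
    "area": rng.lognormal(mean=7.0, sigma=0.5, size=n),     # right-skewed
    "quality": rng.integers(1, 11, size=n).astype(float),   # roughly symmetric
})
price = 50.0 * X["area"] * (1 + 0.05 * X["quality"]) * rng.lognormal(0.0, 0.1, size=n)

# Log-scale the target so errors are measured on a relative scale.
y = np.log1p(price)

X_t, skewed = transform_skewed(X)

# RidgeCV picks the regularization strength alpha by 5-fold cross-validation.
model = RidgeCV(alphas=np.logspace(-3, 3, 13), cv=5)
model.fit(X_t, y)

# Cross-validated RMSE on the log-transformed target.
scores = cross_val_score(model, X_t, y, scoring="neg_root_mean_squared_error", cv=5)
rmse = -scores.mean()

print("Transformed columns:", skewed)
print("Selected alpha:", model.alpha_)
print("CV RMSE (log target):", round(rmse, 4))
```

With 79+ features, many of them one-hot encoded, it is usually worth standardizing the inputs before Ridge so the penalty treats all coefficients on a comparable scale; that step is omitted here to keep the sketch short.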
- Clone the repo: `git clone https://github.com/lucalullo/House-price.git`
- Install dependencies: `pip install pandas numpy seaborn matplotlib scikit-learn scipy`
- Run the `house-prices.ipynb` notebook.
Author: Luca Lullo
Data Scientist | Machine Learning Applied