This project implements Lasso Regression using scikit-learn to predict house prices from a housing dataset. Lasso Regression applies L1 regularization, which not only helps reduce overfitting but can also perform feature selection by shrinking some coefficients to zero.
The notebook demonstrates the complete machine learning workflow, including data loading, preprocessing, model training, evaluation, and residual analysis.
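For reference, the objective that scikit-learn's `Lasso` estimator minimizes can be written as below. Here $n$ is the number of training samples, $X$ the feature matrix, $y$ the target prices, $w$ the coefficient vector, and $\alpha$ the regularization strength (the `alpha` parameter). The intercept is left out for brevity.

$$
\min_{w} \; \frac{1}{2n} \lVert y - Xw \rVert_2^2 + \alpha \lVert w \rVert_1
$$

The $\lVert w \rVert_1$ penalty is what drives some coefficients to exactly zero as $\alpha$ grows.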
Lasso_Regression
│
├── Lasso_Regression.ipynb
├── housing.csv
├── residual_distribution.png
└── README.md
- File: housing.csv
- Type: Tabular housing data
- Purpose: Used to train and evaluate a Lasso Regression model for house price prediction
- Python
- NumPy
- Pandas
- Matplotlib
- scikit-learn
- Load the housing dataset
- Perform train-test split
- Train a Lasso Regression model
- Predict house prices on test data
- Evaluate model performance using R² Score
- Analyze residual distribution
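The steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the column names of housing.csv are not documented here, so the sketch uses a synthetic dataset from `make_regression` as a stand-in. The `alpha=1.0` value, the split ratio, and the output filename `residual_sketch.png` are likewise assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for housing.csv (assumption: real features and target are not known here).
X, y = make_regression(
    n_samples=500, n_features=8, n_informative=4, noise=10.0, random_state=0
)

# Hold out 20% of the rows for evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Fit Lasso on the training split only.
model = Lasso(alpha=1.0)
model.fit(X_train, y_train)

# Predict on the held-out data and score with R^2.
lasso_pred = model.predict(X_test)
score = r2_score(y_test, lasso_pred)
print(f"R2 score: {score:.3f}")

# Residuals are the differences between actual and predicted targets.
residuals = y_test - lasso_pred
fig, ax = plt.subplots()
counts, bin_edges, _ = ax.hist(residuals, bins=30)
ax.set_xlabel("Residual (actual - predicted)")
ax.set_ylabel("Count")
fig.savefig("residual_sketch.png")  # hypothetical filename
plt.close(fig)
```

With the real dataset, `X` and `y` would instead come from `pandas.read_csv("housing.csv")` after selecting the feature and target columns.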
R² Score: 0.6395660373503593
Interpretation:
The model explains approximately 64% of the variance in house prices on the test data.
Lasso regularization helps simplify the model by reducing the impact of less
important features while maintaining competitive performance.
Residual Distribution (y_test − predicted prices):
- Residuals are approximately normally distributed
- This is consistent with the normal-errors assumption of linear regression, although it does not by itself check linearity or constant variance
- Feature sparsity introduced by Lasso improves interpretability
- Lasso Regression performs implicit feature selection
- Helps reduce model complexity
- Useful when dealing with high-dimensional feature spaces
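The feature selection effect can be seen by counting how many coefficients Lasso sets to exactly zero as `alpha` increases. The sketch below is an illustration on synthetic data (an assumption, since the housing features are not listed here), and the three `alpha` values are arbitrary choices.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso
from sklearn.preprocessing import StandardScaler

# Synthetic data: 20 features, of which only 5 actually influence the target.
X, y = make_regression(
    n_samples=300, n_features=20, n_informative=5, noise=5.0, random_state=0
)

# Standardize so the L1 penalty treats every feature on the same scale.
X_scaled = StandardScaler().fit_transform(X)

zero_counts = {}
for alpha in (0.01, 1.0, 10.0):
    coef = Lasso(alpha=alpha, max_iter=10_000).fit(X_scaled, y).coef_
    # Coefficients that are exactly zero correspond to eliminated features.
    zero_counts[alpha] = int(np.sum(coef == 0.0))
    print(f"alpha={alpha}: {zero_counts[alpha]} of {coef.size} coefficients are zero")
```

Features whose coefficient is zero contribute nothing to the prediction, which is why a sparse Lasso model is easier to interpret.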
- Clone the repository
git clone https://github.com/btboilerplate/Lasso_Regression.git
- Install required libraries
pip install numpy pandas matplotlib scikit-learn
- Open Lasso_Regression.ipynb
- Run all cells sequentially
- Compare Lasso vs Ridge vs ElasticNet
- Tune the alpha parameter using cross-validation
- Analyze selected vs eliminated features
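A possible starting point for the first two items is sketched below, assuming synthetic data in place of housing.csv. `LassoCV` picks `alpha` by cross-validation, and the three models are then compared by mean 5-fold R². The Ridge `alpha` and the ElasticNet `l1_ratio` are illustrative values, not tuned ones.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import ElasticNet, Lasso, LassoCV, Ridge
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the housing data (assumption).
X, y = make_regression(
    n_samples=400, n_features=15, n_informative=6, noise=10.0, random_state=0
)

# Tune alpha with 5-fold cross-validation; scaling lives inside the pipeline
# so each fold is standardized using its own training portion only.
lasso_cv = make_pipeline(StandardScaler(), LassoCV(cv=5, random_state=0))
lasso_cv.fit(X, y)
best_alpha = lasso_cv.named_steps["lassocv"].alpha_
print(f"Selected alpha: {best_alpha:.4f}")

# Compare the three regularized linear models on the same folds.
models = {
    "Lasso": Lasso(alpha=best_alpha),
    "Ridge": Ridge(alpha=1.0),
    "ElasticNet": ElasticNet(alpha=best_alpha, l1_ratio=0.5),
}
cv_scores = {}
for name, estimator in models.items():
    pipeline = make_pipeline(StandardScaler(), estimator)
    cv_scores[name] = cross_val_score(pipeline, X, y, cv=5, scoring="r2").mean()
    print(f"{name}: mean CV R2 = {cv_scores[name]:.3f}")
```

For a fair comparison, Ridge and ElasticNet would each need their own tuned `alpha` (for example via `RidgeCV` and `ElasticNetCV`).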
