This project focused on designing and implementing a machine learning model to predict Airbnb listing prices across New York City. I conducted end-to-end data preparation, analysis, and model optimization using real-world datasets to understand how features such as location, room type, and reviews influence pricing. The outcome demonstrated improved predictive accuracy and provided actionable insights for optimizing short-term rental pricing strategies.
The details:
- Performed data preprocessing and feature engineering on 50K+ Airbnb listings, handling missing values, outliers, and categorical encoding to ensure model readiness.
- Explored and visualized pricing patterns using Pandas, Matplotlib, and Seaborn to identify correlations between location, availability, and user ratings.
- Trained and compared regression models (Linear Regression, Decision Trees, Neural Networks) using scikit-learn and TensorFlow, tuning parameters for best fit.
- Evaluated performance with metrics such as Mean Absolute Error and R², improving predictive accuracy by more than 10% after feature refinement.
- Documented insights and results in a Jupyter Notebook, emphasizing interpretability and reproducibility of methods for scalable data science applications.
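
The workflow above (imputation, outlier handling, categorical encoding, model comparison, and MAE/R² evaluation) can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the data is synthetic, the column names (`neighbourhood_group`, `room_type`, `availability_365`, `reviews_per_month`, `price`) are assumptions, and the neural network is left out to keep the example to scikit-learn only.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for the listings data (hypothetical columns and values).
rng = np.random.default_rng(0)
n = 2000
df = pd.DataFrame({
    "neighbourhood_group": rng.choice(["Manhattan", "Brooklyn", "Queens"], size=n),
    "room_type": rng.choice(["Entire home/apt", "Private room", "Shared room"], size=n),
    "availability_365": rng.integers(0, 366, size=n),
    "reviews_per_month": rng.exponential(1.5, size=n),
})
base = df["room_type"].map(
    {"Entire home/apt": 180.0, "Private room": 90.0, "Shared room": 50.0}
)
borough = df["neighbourhood_group"].map(
    {"Manhattan": 1.4, "Brooklyn": 1.0, "Queens": 0.8}
)
df["price"] = base * borough * rng.lognormal(0.0, 0.3, size=n)

# Simulate missing review data, then drop extreme prices (top 1%) as outliers.
df.loc[rng.random(n) < 0.2, "reviews_per_month"] = np.nan
df = df[df["price"] <= df["price"].quantile(0.99)]

categorical = ["neighbourhood_group", "room_type"]
numeric = ["availability_365", "reviews_per_month"]


def make_preprocessor():
    """One-hot encode categoricals; median-impute and scale numerics."""
    numeric_steps = Pipeline([
        ("impute", SimpleImputer(strategy="median")),
        ("scale", StandardScaler()),
    ])
    return ColumnTransformer([
        ("cat", OneHotEncoder(handle_unknown="ignore"), categorical),
        ("num", numeric_steps, numeric),
    ])


X_train, X_test, y_train, y_test = train_test_split(
    df[categorical + numeric], df["price"], test_size=0.2, random_state=0
)

models = {
    "linear_regression": LinearRegression(),
    "decision_tree": DecisionTreeRegressor(max_depth=6, random_state=0),
}

# Fit each model inside the same preprocessing pipeline and score on held-out data.
results = {}
for name, model in models.items():
    pipe = Pipeline([("prep", make_preprocessor()), ("model", model)])
    pipe.fit(X_train, y_train)
    pred = pipe.predict(X_test)
    results[name] = {
        "mae": mean_absolute_error(y_test, pred),
        "r2": r2_score(y_test, pred),
    }

for name, scores in results.items():
    print(f"{name}: MAE={scores['mae']:.2f}, R2={scores['r2']:.3f}")
```

Wrapping preprocessing and the estimator in a single `Pipeline` means imputation and scaling statistics are learned from the training split only, which keeps the held-out MAE and R² comparable across models.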