The Taxi Demand Predictor is a machine learning application that predicts taxi demand in a given area from historical data. It combines gradient-boosted models, geospatial analysis, and a feature store to produce demand forecasts that taxi companies and ride-sharing services can use.
The application provides the following features:

- Data Ingestion: Fetches and processes historical taxi demand data.
- Machine Learning Models: Utilizes models such as LightGBM and XGBoost for demand prediction.
- Interactive Visualization: Provides a user-friendly interface using Streamlit for visualizing predictions and trends.
- Geospatial Analysis: Integrates geospatial data to enhance prediction accuracy based on location.
- Feature Store Integration: Uses Hopsworks for managing and serving features for model training and inference.
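As a rough illustration of how historical demand data can feed a model such as LightGBM or XGBoost, the sketch below turns an hourly ride-count series into lagged feature rows and targets. The function name `make_lag_features`, the window size, and the sample counts are all hypothetical; this README does not describe the project's actual feature design, so treat this as one common approach rather than the project's implementation.

```python
from typing import List, Tuple


def make_lag_features(
    counts: List[int], n_lags: int
) -> Tuple[List[List[int]], List[int]]:
    """Turn an hourly ride-count series into (features, target) pairs.

    Each feature row holds the previous `n_lags` hourly counts, and the
    target is the count for the hour that immediately follows them.
    """
    if n_lags < 1:
        raise ValueError("n_lags must be at least 1")
    features: List[List[int]] = []
    targets: List[int] = []
    for end in range(n_lags, len(counts)):
        features.append(counts[end - n_lags:end])
        targets.append(counts[end])
    return features, targets


# Hypothetical hourly pickup counts for a single zone (made-up numbers).
hourly_counts = [12, 15, 9, 20, 25, 18]
X, y = make_lag_features(hourly_counts, n_lags=3)
```

Rows shaped like `X` and `y` are what a tabular regressor would typically be trained on, with one such series per pickup zone.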
The project uses the following technologies:

- Python: The primary programming language for data processing and model development.
- Streamlit: A framework for building interactive web applications.
- Pandas: For data manipulation and analysis.
- Scikit-learn: For implementing machine learning algorithms.
- LightGBM: A gradient boosting framework that uses tree-based learning algorithms.
- XGBoost: An optimized distributed gradient boosting library.
- GeoPandas: For geospatial data processing.
- Hopsworks: For managing feature stores and serving features to models.
To set up the project locally, follow these steps:
- Clone the repository:

      git clone https://github.com/yourusername/taxi_demand_predictor.git
      cd taxi_demand_predictor

- Install Poetry if you haven't already:

      curl -sSL https://install.python-poetry.org | python3 -

- Install the project dependencies:

      poetry install

- Create a `.env` file in the project root and add your `HOPSWORKS_API_KEY`:

      HOPSWORKS_API_KEY="your_api_key_here"
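For reference, here is a minimal sketch of how application code might pick up `HOPSWORKS_API_KEY`, checking the process environment first and falling back to the `.env` file. The helper names `read_env_file` and `get_hopsworks_api_key` are hypothetical, and the project itself may well rely on a library such as python-dotenv instead; this standard-library version only illustrates the idea.

```python
import os
from pathlib import Path
from typing import Dict


def read_env_file(path: Path) -> Dict[str, str]:
    """Parse simple KEY=value lines, skipping blank lines and # comments."""
    values: Dict[str, str] = {}
    if not path.exists():
        return values
    for raw_line in path.read_text().splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def get_hopsworks_api_key(env_path: Path = Path(".env")) -> str:
    """Return the API key from the environment, falling back to the .env file."""
    key = os.environ.get("HOPSWORKS_API_KEY") or read_env_file(env_path).get(
        "HOPSWORKS_API_KEY"
    )
    if not key:
        raise RuntimeError("HOPSWORKS_API_KEY is not set")
    return key
```

Failing fast with a clear error when the key is missing tends to be easier to debug than a later authentication failure from the feature store client.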
To run the application, use the following command:

    poetry run streamlit run src/frontend.py

Open your web browser and navigate to http://localhost:8501 to access the application.
Contributions are welcome! If you have suggestions for improvements or new features, please open an issue or submit a pull request.
This project is licensed under the MIT License. See the LICENSE file for details.