Python-SQL Data Analysis Pipeline

DESCRIPTION:

This project demonstrates an end-to-end data analysis pipeline using Python and SQL.
It leverages the Kaggle API to fetch a retail orders dataset, performs data cleaning and transformation with Pandas,
and loads the processed data into SQL Server for in-depth analysis.

KEY FEATURES:

Data Acquisition:

Utilizes the Kaggle API to download the required dataset.
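
A minimal sketch of the download step, assuming the official `kaggle` Python package is installed and credentials are configured. The function name `download_dataset` and the dataset slug used in the usage note are hypothetical; this README does not name the actual dataset.

```python
from pathlib import Path


def download_dataset(slug: str, dest: str = "data") -> Path:
    """Download and unzip a Kaggle dataset into `dest`, returning that folder."""
    # Imported lazily because some versions of the kaggle package try to
    # authenticate as soon as the package is imported.
    from kaggle.api.kaggle_api_extended import KaggleApi

    api = KaggleApi()
    api.authenticate()  # reads the configured Kaggle credentials

    target = Path(dest)
    target.mkdir(parents=True, exist_ok=True)

    # `slug` has the form "<owner>/<dataset-name>"
    api.dataset_download_files(slug, path=str(target), unzip=True)
    return target
```

It could be called as `download_dataset("owner/retail-orders")`, where the slug is a placeholder to replace with the real dataset identifier.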

Data Cleaning and Preparation:

Employs Pandas to handle missing values, resolve inconsistencies, and perform data transformations.
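
A rough sketch of what the cleaning step might look like. The column names (`Order Date`, `Ship Mode`, `Sale Price`, `Cost Price`) and the placeholder strings are assumptions for illustration, not taken from the actual dataset.

```python
import pandas as pd

# Strings assumed to stand in for missing values (hypothetical)
PLACEHOLDERS = ["Not Available", "unknown"]


def clean_orders(df: pd.DataFrame) -> pd.DataFrame:
    """Return a cleaned copy of the raw orders frame."""
    out = df.copy()

    # Normalize column names, e.g. "Order Date" -> "order_date"
    out.columns = (
        out.columns.str.strip().str.lower().str.replace(" ", "_", regex=False)
    )

    # Turn placeholder strings into real missing values
    out["ship_mode"] = out["ship_mode"].mask(out["ship_mode"].isin(PLACEHOLDERS))

    # Parse the order date from ISO-formatted text
    out["order_date"] = pd.to_datetime(out["order_date"], format="%Y-%m-%d")

    # Derive a profit column from the price columns
    out["profit"] = out["sale_price"] - out["cost_price"]
    return out
```

Working on a copy keeps the raw frame intact, so the cleaning can be rerun or adjusted without downloading the data again.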

Data Loading:

Loads the cleaned and prepared data into a SQL Server table, ready for querying.
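
One way the load step could be written is with `DataFrame.to_sql`. In the sketch below the table name `df_orders`, the helper `load_orders`, and the connection URL in the comment are all hypothetical; for SQL Server the connection would typically be a SQLAlchemy engine backed by an ODBC driver.

```python
import pandas as pd


def load_orders(df: pd.DataFrame, con, table: str = "df_orders") -> int:
    """Append the rows of `df` to `table` and return how many were sent."""
    # if_exists="append" creates the table when it is missing and
    # otherwise adds rows to the existing one.
    df.to_sql(table, con=con, index=False, if_exists="append")
    return len(df)


# For SQL Server, `con` might be built like this (server, database and
# driver names are placeholders, not values from this project):
#
#   from sqlalchemy import create_engine
#   engine = create_engine(
#       "mssql+pyodbc://@MY_SERVER/MY_DB"
#       "?driver=ODBC+Driver+17+for+SQL+Server&trusted_connection=yes"
#   )
#   load_orders(cleaned_df, engine)
```

Creating the table in SQL Server beforehand with explicit column types, then appending, generally gives tighter control over data types than letting Pandas infer them.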

Data Analysis:

Executes SQL queries to address specific business questions, such as:

Identifying top-selling and revenue-generating products.
Analyzing regional sales trends.
Comparing sales performance across different time periods.
Evaluating profit growth for various product categories.
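
As an illustration of the first question above, here is a sketch of a revenue-ranking query run from Python. The table and column names (`df_orders`, `product_id`, `sale_price`) are assumptions. The sketch is written against the standard-library `sqlite3` module so it runs without a server; on SQL Server the same query would use `SELECT TOP (n) ...` instead of `LIMIT`.

```python
import sqlite3

# Total revenue per product, highest first (SQLite dialect)
TOP_PRODUCTS_SQL = """
SELECT product_id, SUM(sale_price) AS revenue
FROM df_orders
GROUP BY product_id
ORDER BY revenue DESC
LIMIT ?
"""


def top_products(con: sqlite3.Connection, n: int = 10):
    """Return up to `n` (product_id, revenue) rows ranked by revenue."""
    return con.execute(TOP_PRODUCTS_SQL, (n,)).fetchall()
```

The other questions follow the same pattern: group by region, by year and month, or by category, then aggregate sales or profit.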

PREREQUISITES:

Python 3
Pandas
SQL Server
Kaggle API credentials
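
Kaggle credentials are commonly supplied either through the `KAGGLE_USERNAME` and `KAGGLE_KEY` environment variables or through a `kaggle.json` file in a `.kaggle` folder under the home directory. A small pre-flight check, sketched under that assumption (the helper name `has_kaggle_credentials` is hypothetical):

```python
import os
from pathlib import Path


def has_kaggle_credentials(home=None, env=None) -> bool:
    """Return True if Kaggle credentials appear to be configured."""
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)

    # Option 1: both environment variables are set
    if env.get("KAGGLE_USERNAME") and env.get("KAGGLE_KEY"):
        return True

    # Option 2: the token file downloaded from the Kaggle account page
    return (home / ".kaggle" / "kaggle.json").is_file()
```

Running this check before the download step gives a clearer failure message than an authentication error raised from inside the API client.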

Feel free to contribute to this project by:

Enhancing the data analysis queries.
Adding new features or visualizations.
Improving the project's documentation.