Skip to content

evasafi/Supermarket-Sales-Data-Analysis

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

2 Commits
 
 
 
 

Repository files navigation

Supermarket Sales Analytics & Business Intelligence

Python Pandas Matplotlib License Status

Professional-grade data science analysis demonstrating advanced analytics, business intelligence, and strategic insights from real-world retail transaction data.


Overview

This repository contains a comprehensive end-to-end data analysis of supermarket sales transactions across multiple stores and product categories. The project demonstrates enterprise-level analytics capabilities including customer segmentation, time-series forecasting, profitability analysis, and actionable business recommendations.

Perfect for: Portfolio projects, data science interviews, business analytics roles, or as a template for retail analytics.


Key Features

Advanced Analytics

  • RFM Segmentation - Identify high-value, at-risk, and dormant customers
  • Time Series Analysis - Detect seasonality, trends, and patterns
  • Pareto Analysis - Discover the vital 20% driving 80% of revenue
  • Anomaly Detection - Find unusual transactions and operational issues
  • Cross-sell Mapping - Identify product pairs frequently purchased together
  • Profitability Analysis - Calculate margins and profit per transaction

Business Insights

  • Executive-level KPI metrics (revenue, transactions, satisfaction)
  • Store performance benchmarking and comparison
  • Customer behavior and demographic analysis
  • Product category ranking and growth analysis
  • Seasonal patterns and forecasting
  • Strategic recommendations with estimated ROI

Professional Visualizations

  • 6 dashboard-style PNG visualizations (300 DPI)
  • Color-coded for easy interpretation
  • Ready for executive presentations
  • Automatically generated from data

Production-Ready Code

  • Clean, commented, well-organized Python
  • Modular sections for easy modification
  • Follows industry best practices
  • Documented assumptions and methodologies

Project Structure

supermarket-analytics/
├── README.md                              # This file
├── ANALYSIS_GUIDE.md                      # Detailed analysis documentation
├── LICENSE                                # MIT License
├── requirements.txt                       # Python dependencies
│
├── data/
│   └── SuperMarket_Analysis.csv           # Sample dataset (1,000 transactions)
│
├── notebooks/
│   ├── analysis.ipynb                     # Main analysis notebook
│   └── implementation_guide.ipynb          # Step-by-step tutorial
│
├── scripts/
│   ├── supermarket_advanced_analysis.py   # Core analysis engine (13 sections)
│   ├── supermarket_visualizations.py      # Visualization generation
│   └── supermarket_implementation_guide.py # Step-by-step code
│
├── outputs/
│   ├── 01_executive_dashboard.png         # KPIs & key trends
│   ├── 02_temporal_patterns.png           # Seasonality & trends
│   ├── 03_customer_segmentation.png       # Customer behavior
│   ├── 04_store_benchmarking.png          # Store comparisons
│   ├── 05_category_analysis.png           # Product performance
│   ├── 06_summary_report_card.png         # Executive summary
│   └── supermarket_analysis_summary.csv   # Export metrics
│
└── docs/
    ├── INSTALLATION.md                    # Setup instructions
    ├── USAGE.md                           # How to run the analysis
    └── INTERPRETATION.md                  # Understanding the results

Quick Start

Prerequisites

  • Python 3.8+
  • Jupyter Lab or Jupyter Notebook
  • pip package manager

Installation

  1. Clone the repository:
git clone https://github.com/yourusername/supermarket-analytics.git
cd supermarket-analytics
  1. Create a virtual environment (optional but recommended):
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
  1. Install dependencies:
pip install -r requirements.txt

Running the Analysis

Option A: Quick Run (5-10 minutes)

jupyter lab
# Then in a cell:
exec(open('scripts/supermarket_advanced_analysis.py').read())
exec(open('scripts/supermarket_visualizations.py').read())

Option B: Step-by-Step Learning (30-60 minutes)

jupyter lab
# Open: notebooks/implementation_guide.ipynb
# Run cells one by one to understand each step

Option C: Command Line (Fastest)

python scripts/supermarket_advanced_analysis.py
python scripts/supermarket_visualizations.py

Analysis Sections

1. Executive Summary Metrics

High-level KPIs including total revenue, transactions, customer satisfaction, and store breakdown.

Key Metrics:

  • Total Revenue: $X.XXM
  • Transaction Count: X,XXX
  • Average Transaction Value: $XXX.XX
  • Customer Satisfaction: X.XX/10.0

2. Temporal Analysis

Discovers patterns in sales over time: daily, weekly, monthly, and quarterly trends.

Insights:

  • Peak days of week (e.g., Saturday +15% above average)
  • Seasonal patterns (e.g., Q4 +35% above baseline)
  • Monthly growth trends
  • Best performing time periods

3. Product Category Analysis

Deep dive into product performance with Pareto analysis.

Discoveries:

  • Top 10 revenue-generating categories
  • Category growth rates
  • Pricing and transaction patterns
  • Revenue concentration (80/20 rule)

4. Customer Segmentation & RFM

RFM (Recency, Frequency, Monetary) analysis for customer behavior.

Segments:

  • High-value loyal customers
  • At-risk customers
  • New customers
  • Dormant customers
  • Member vs. Non-member analysis
  • Gender and demographic breakdown

5. Store Performance Benchmarking

Comparative analysis across stores.

Metrics:

  • Revenue ranking
  • Transaction volume
  • Average transaction value
  • Customer satisfaction by store
  • Consistency analysis

6. Quality & Satisfaction Analysis

Customer satisfaction insights and quality control.

Findings:

  • Average rating by category
  • Categories below satisfaction threshold
  • Rating distribution
  • Quality improvement opportunities

7. Anomaly Detection

Identifies unusual transactions and operational issues.

Detections:

  • High-value transaction analysis
  • Low-revenue days
  • Outlier transactions
  • Potential data quality issues

8. Cross-Selling Opportunities

Identifies product pairs frequently purchased together.

Uses:

  • Bundle promotion recommendations
  • Product placement optimization
  • Inventory correlation planning

9. Profitability Analysis

Margin and profit metrics by category.

Metrics:

  • Gross profit by category
  • Profit margins
  • Profit per transaction
  • Most profitable products

10. Forecasting & Predictive Insights

Trend projection and seasonality strength.

Outputs:

  • Growth trajectory
  • Seasonality factor
  • Trend direction
  • Next period projections

11. Strategic Recommendations

Actionable insights ranked by impact.

Recommendations Include:

  • Inventory optimization
  • Promotion strategy
  • Store improvement initiatives
  • Customer loyalty programs
  • Payment system optimization

Sample Results

Key Findings (Example Output)

EXECUTIVE SUMMARY
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Total Revenue:           $322,965.25
Total Transactions:      1,000
Avg Transaction Value:   $322.97
Customer Satisfaction:   6.97/10.0

REVENUE BY STORE:
               Total Revenue    Avg Transaction    Avg Rating
Yangon         $111,923.78           $316.54          6.75
Naypyitaw      $106,482.24           $329.18          7.02
Giza           $104,559.23           $321.54          7.22

TOP 3 CATEGORIES:
1. Food and beverages:  $56,471 (17.5% of revenue)
2. Fashion accessories: $54,601 (16.9% of revenue)
3. Sports and travel:   $53,861 (16.7% of revenue)

SEASONALITY ANALYSIS:
Strong Q4 peak (+35% above average)
Recommend 2-month advance inventory buildup

GROWTH ANALYSIS:
Member penetration:     31.5%
Member spend premium:   1.8x non-members
Loyalty opportunity:    +$X revenue if 50% penetration

Use Cases

This analysis can be adapted for:

  • Retail Analytics - Optimize store operations and inventory
  • E-commerce - Improve product recommendations and bundling
  • Marketing - Target customer segments effectively
  • Finance - Profitability analysis and margin optimization
  • Operations - Identify underperforming locations
  • Portfolio Projects - Demonstrate data science capabilities
  • Job Interviews - Show real-world analytics experience
  • Academic Projects - Apply statistics and data science concepts

Visualizations Included

1. Executive Dashboard

KPIs, monthly trends, quarterly breakdown, top categories, store performance.

2. Temporal Patterns

Day-of-week patterns, monthly trends, category trends over time, quarterly performance.

3. Customer Segmentation

Customer type breakdown, gender analysis, payment method distribution, satisfaction ratings.

4. Store Benchmarking

Revenue comparison, transaction volume, average transaction value, customer ratings.

5. Category Analysis

Store-category heatmap, performance matrix, top categories, satisfaction ranking.

6. Summary Report Card

Text-based executive summary with key metrics and actionable recommendations.


Customization

Modify for Your Data

  1. Replace the dataset:
# In your notebook:
sales = pd.read_csv('your_data.csv')
  1. Update column names:
sales.rename(columns={
    'your_date_column': 'Date',
    'your_revenue_column': 'Revenue',
    'your_store_column': 'Store'
}, inplace=True)
  1. Run the analysis:
exec(open('scripts/supermarket_advanced_analysis.py').read())

Add Custom Analysis

The modular structure makes it easy to add sections:

# Add a new section
print("\n" + "="*80)
print("MY CUSTOM ANALYSIS")
print("="*80)

my_metric = sales.groupby('your_column').agg({...})
print(my_metric)

Documentation


Technologies Used

Tool Purpose
Python 3.8+ Core programming language
Pandas Data manipulation and analysis
NumPy Numerical computations
Matplotlib Visualization creation
Seaborn Statistical data visualization
Jupyter Lab Interactive notebooks

Requirements

All dependencies are listed in requirements.txt:

pandas>=1.3.0
numpy>=1.21.0
matplotlib>=3.4.0
seaborn>=0.11.0
jupyter>=1.0.0
jupyterlab>=3.0.0

Install with:

pip install -r requirements.txt

Sample Dataset

The included SuperMarket_Analysis.csv contains:

  • 1,000 transactions across 3 stores
  • 17 features including date, product line, revenue, customer type, rating
  • 6 product categories (Food, Electronics, Fashion, etc.)
  • Real-world data complexity (missing values, outliers, seasonality)

Perfect for learning and demonstration purposes.


Expected Output

After running the analysis, you'll get:

  • Console Output - 100+ detailed metrics and insights
  • 6 PNG Visualizations - Publication-ready (300 DPI)
  • CSV Export - Summary metrics table
  • Actionable Recommendations - Ranked by business impact

Learning Outcomes

After working through this project, you'll understand:

Technical Skills

  • Data loading and cleaning
  • Multi-level aggregations and grouping
  • Time series analysis
  • Customer segmentation techniques
  • Anomaly detection methods
  • Professional visualization creation
  • Statistical analysis and metrics

Business Skills

  • Identifying revenue drivers
  • Analyzing customer behavior
  • Spotting operational issues
  • Calculating business impact
  • Creating executive-ready presentations
  • Strategic thinking and planning

Soft Skills

  • Data storytelling
  • Communicating with stakeholders
  • Problem decomposition
  • Documentation best practices

Troubleshooting

Common Issues

"FileNotFoundError: CSV not found"

# Make sure CSV is in the same directory
import os
print(os.listdir())  # Should show your CSV

"ModuleNotFoundError: pandas"

pip install pandas matplotlib seaborn

Visualizations don't display in Jupyter

%matplotlib inline

See docs/INSTALLATION.md for more troubleshooting.


License

This project is licensed under the MIT License - see LICENSE file for details.

Free for personal and commercial use with attribution.


Contributing

Contributions are welcome! Here's how you can help:

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-analysis)
  3. Commit your changes (git commit -m 'Add amazing analysis')
  4. Push to the branch (git push origin feature/amazing-analysis)
  5. Open a Pull Request

Ideas for Contributions

  • Additional analysis sections
  • Different datasets
  • Interactive dashboards
  • Machine learning models
  • Real-time analytics
  • Documentation improvements

Next Steps

  1. Clone this repository

    git clone https://github.com/yourusername/supermarket-analytics.git
  2. Install dependencies

    pip install -r requirements.txt
  3. Run the analysis

    jupyter lab
  4. Explore the visualizations

    • Check the outputs/ folder for generated charts
    • Review the console output for detailed metrics
  5. Customize for your data

    • Replace the CSV with your dataset
    • Modify column names and parameters
    • Add your own analysis sections

Resources


Roadmap

  • Core analysis implementation
  • Visualization generation
  • Documentation
  • Interactive Jupyter widgets
  • Streamlit dashboard
  • REST API for real-time analysis
  • Machine learning models
  • Time series forecasting
  • Docker containerization

Checklist for Users

Before presenting this analysis:

  • Downloaded and extracted the repository
  • Installed all dependencies
  • Successfully ran the analysis scripts
  • Generated all visualizations
  • Reviewed console output metrics
  • Understood each analysis section
  • Created presentation slides
  • Practiced explanation of insights
  • Prepared answers to likely questions

Project Stats

  • Lines of Code: 1,500+
  • Analysis Sections: 11
  • Visualizations: 6
  • Metrics Calculated: 100+
  • Execution Time: 5-10 minutes
  • Documentation: Comprehensive

Highlights

This project demonstrates:

  • Real-world complexity - Working with actual retail data
  • End-to-end workflow - From raw data to actionable insights
  • Business acumen - Understanding ROI and strategic impact
  • Technical excellence - Clean, professional code
  • Communication skills - Executive-ready visualizations
  • Problem-solving - Answering complex business questions

Made by Eva Safi

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors