End-to-end SQL analytics project analyzing transactional retail data to uncover revenue trends, product concentration, and customer retention dynamics.
Dataset: ~1M transaction records
Database: PostgreSQL

Key findings:
- 21% of products generate 80% of total revenue (Product Pareto).
- 23% of customers generate 80% of total revenue (Customer Pareto).
- 72% of customers are repeat buyers.
- Repeat customers drive ~97% of total revenue.

Data cleaning:
- Removed cancelled invoices
- Filtered invalid quantities
- Converted date formats
- Created revenue column
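
The cleaning steps above can be sketched as follows. This is a minimal illustration, not the project's actual script: it runs on an in-memory SQLite database through Python's `sqlite3` module instead of PostgreSQL, and the table and column names (`raw_transactions`, `clean_transactions`, `invoice_no`, and so on) are hypothetical. It also assumes that cancelled invoices carry a `C` prefix on the invoice number, that an invalid quantity means zero or negative, and that raw dates arrive as `DD/MM/YYYY HH:MM` text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE raw_transactions (
    invoice_no TEXT, stock_code TEXT, quantity INTEGER,
    unit_price REAL, invoice_date TEXT, customer_id TEXT
);
INSERT INTO raw_transactions VALUES
    ('1001',  'A1',  2, 5.0, '01/12/2010 08:26', 'U1'),
    ('C1002', 'A1', -2, 5.0, '02/12/2010 09:00', 'U1'),  -- cancelled (assumed 'C' prefix)
    ('1003',  'B2',  0, 3.0, '03/12/2010 10:15', 'U2'),  -- invalid quantity
    ('1004',  'B2',  4, 2.5, '15/01/2011 14:30', 'U2');

CREATE TABLE clean_transactions AS
SELECT invoice_no, stock_code, customer_id, quantity, unit_price,
       -- 'DD/MM/YYYY HH:MM' -> 'YYYY-MM-DD HH:MM'
       -- (in PostgreSQL, to_timestamp(invoice_date, 'DD/MM/YYYY HH24:MI') would do this)
       substr(invoice_date, 7, 4) || '-' || substr(invoice_date, 4, 2) || '-' ||
       substr(invoice_date, 1, 2) || substr(invoice_date, 11) AS invoice_ts,
       quantity * unit_price AS revenue
FROM raw_transactions
WHERE invoice_no NOT LIKE 'C%'   -- drop cancelled invoices
  AND quantity > 0;              -- drop invalid quantities
""")

cleaned = conn.execute(
    "SELECT invoice_no, invoice_ts, revenue FROM clean_transactions ORDER BY invoice_no"
).fetchall()
for row in cleaned:
    print(row)
```

Only the two valid invoices survive, each with an ISO-style timestamp and a computed `revenue` value.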

Revenue trend analysis:
- Monthly revenue
- Month-over-month (MoM) growth
- Year-over-year (YoY) growth
- Seasonal spike detection
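
One way to compute these metrics is a CTE for monthly totals followed by `LAG` window functions. The sketch below is illustrative only: it uses in-memory SQLite via Python rather than PostgreSQL, a hypothetical `clean_transactions(invoice_ts, revenue)` table with made-up numbers, and an arbitrary spike rule (a month exceeding 1.5 times the average monthly revenue) that is an assumption, not the project's definition. `LAG(revenue, 12)` also assumes there are no missing months.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE clean_transactions (invoice_ts TEXT, revenue REAL)")

# Synthetic data: 13 months at 100 per month, with a November spike
# and a stronger December in the second year.
months = ["2010-12"] + [f"2011-{m:02d}" for m in range(1, 13)]
monthly_totals = dict.fromkeys(months, 100.0)
monthly_totals["2011-11"] = 300.0
monthly_totals["2011-12"] = 150.0
sample_rows = []
for month, total in monthly_totals.items():
    # Two transactions per month so the aggregation has something to sum.
    sample_rows.append((f"{month}-05 10:00", total / 2))
    sample_rows.append((f"{month}-20 10:00", total / 2))
conn.executemany("INSERT INTO clean_transactions VALUES (?, ?)", sample_rows)

trend = conn.execute("""
WITH monthly AS (
    -- PostgreSQL equivalent of the bucket: date_trunc('month', invoice_ts)
    SELECT strftime('%Y-%m', invoice_ts) AS month, SUM(revenue) AS revenue
    FROM clean_transactions
    GROUP BY 1
),
lagged AS (
    SELECT month, revenue,
           LAG(revenue, 1)  OVER (ORDER BY month) AS prev_month,
           LAG(revenue, 12) OVER (ORDER BY month) AS prev_year,
           AVG(revenue)     OVER ()               AS avg_revenue
    FROM monthly
)
SELECT month, revenue,
       ROUND(100.0 * (revenue - prev_month) / NULLIF(prev_month, 0), 1) AS mom_pct,
       ROUND(100.0 * (revenue - prev_year)  / NULLIF(prev_year, 0), 1)  AS yoy_pct,
       revenue > 1.5 * avg_revenue AS is_spike   -- hypothetical spike threshold
FROM lagged
ORDER BY month
""").fetchall()

for row in trend:
    print(row)
```

Growth columns are `NULL` (Python `None`) wherever no prior period exists, so the first month has no MoM value and only the thirteenth month has a YoY value.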

Product analysis:
- Revenue per SKU
- Cumulative revenue modeling
- Pareto concentration analysis
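
The Pareto calculation can be sketched with `ROW_NUMBER` and a running `SUM ... OVER`: rank SKUs by revenue, accumulate revenue down the ranking, and find the smallest rank at which the cumulative total reaches 80%. The example below is an assumption-laden illustration (in-memory SQLite via Python, a hypothetical `clean_transactions(stock_code, revenue)` table, five made-up SKUs), not the project's query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE clean_transactions (stock_code TEXT, revenue REAL)")
conn.executemany(
    "INSERT INTO clean_transactions VALUES (?, ?)",
    [("A", 300.0), ("A", 200.0), ("B", 350.0), ("C", 80.0), ("D", 40.0), ("E", 30.0)],
)

# Shared CTEs: revenue per SKU, then rank and running total.
RANKED_CTE = """
WITH sku_revenue AS (
    SELECT stock_code, SUM(revenue) AS revenue
    FROM clean_transactions
    GROUP BY stock_code
),
ranked AS (
    SELECT stock_code, revenue,
           ROW_NUMBER() OVER (ORDER BY revenue DESC, stock_code) AS sku_rank,
           SUM(revenue) OVER (ORDER BY revenue DESC, stock_code
                              ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS cum_revenue,
           SUM(revenue) OVER () AS total_revenue,
           COUNT(*)     OVER () AS sku_count
    FROM sku_revenue
)
"""

# Per-SKU view with cumulative share of revenue.
detail = conn.execute(
    RANKED_CTE
    + "SELECT stock_code, revenue, sku_rank, cum_revenue / total_revenue AS cum_share "
    "FROM ranked ORDER BY sku_rank"
).fetchall()

# Smallest number of top SKUs whose cumulative revenue reaches 80% of the total.
summary = conn.execute(
    RANKED_CTE
    + "SELECT MIN(sku_rank) AS skus_for_80pct, "
    "100.0 * MIN(sku_rank) / MAX(sku_count) AS pct_of_skus "
    "FROM ranked WHERE cum_revenue >= 0.8 * total_revenue"
).fetchone()

print(detail)
print(summary)
```

On this toy data the top 2 of 5 SKUs cover 80% of revenue. Grouping by a customer identifier instead of `stock_code` gives the same calculation for customer concentration.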

Customer analysis:
- Repeat customer rate
- Revenue contribution by segment
- Customer Pareto distribution
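
The repeat-customer metrics lend themselves to the `FILTER` clause. The sketch below assumes a repeat customer is one with more than one distinct invoice, which is an assumption about the definition rather than something this README states. It uses in-memory SQLite (3.30 or later for `FILTER`) via Python, with a hypothetical `clean_transactions(customer_id, invoice_no, revenue)` table and made-up rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE clean_transactions (customer_id TEXT, invoice_no TEXT, revenue REAL)"
)
conn.executemany(
    "INSERT INTO clean_transactions VALUES (?, ?, ?)",
    [
        ("U1", "1", 60.0), ("U1", "1", 40.0),   # two line items on one invoice
        ("U1", "2", 200.0),
        ("U2", "3", 50.0),
        ("U3", "4", 30.0), ("U3", "5", 30.0), ("U3", "6", 40.0),
        ("U4", "7", 50.0),
    ],
)

retention = conn.execute("""
WITH customer_stats AS (
    SELECT customer_id,
           COUNT(DISTINCT invoice_no) AS orders,   -- line items do not count as orders
           SUM(revenue)               AS revenue
    FROM clean_transactions
    GROUP BY customer_id
)
SELECT COUNT(*)                                   AS customers,
       COUNT(*) FILTER (WHERE orders > 1)         AS repeat_customers,
       100.0 * COUNT(*) FILTER (WHERE orders > 1) / COUNT(*)          AS repeat_rate_pct,
       100.0 * SUM(revenue) FILTER (WHERE orders > 1) / SUM(revenue)  AS repeat_revenue_pct
FROM customer_stats
""").fetchone()

print(retention)
```

On this toy data, 2 of 4 customers are repeat buyers and they account for 80% of revenue. `FILTER` keeps both segment totals and overall totals in a single pass over `customer_stats`.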

SQL techniques used:
- CTEs
- Window Functions (LAG, ROW_NUMBER, SUM OVER)
- Cumulative calculations
- FILTER clause
- Date truncation
- Aggregations