This repository demonstrates a senior-level data analytics pipeline applied to a complex B2B FinTech scenario (Payment Gateways). Moving beyond basic CRUD queries, this project tackles enterprise challenges: Monthly Recurring Revenue (MRR) Cohort Analysis, Real-Time Fraud & Chargeback Anomaly Detection, and Payment Routing Optimization.
Copy these bullets directly into your resume to showcase 15+ years of data maturity:
- Architected a complete data model for a FinTech payment gateway, implementing indexed schemas and relationships for Merchants, Subscriptions, and Transactions.
- Engineered advanced analytical SQL pipelines using recursive Common Table Expressions (CTEs), window functions (`LAG`, `LEAD`, `SUM OVER`), and rolling statistical averages to identify $2M+ in potential chargeback risks.
- Conducted MRR Cohort Retention Analysis, translating chaotic subscription data into executive-level Net Revenue Retention (NRR) insights month-over-month.
- Built an interactive Web Dashboard (Vanilla HTML/CSS/JS + Chart.js) to dynamically visualize high-level SQL output, ensuring immediate stakeholder alignment and visual impact.
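
For readers unfamiliar with the NRR metric named above, here is a minimal Python sketch of the usual definition: the current-month MRR of last month's merchants divided by last month's MRR, with newly acquired merchants excluded. The function name, the dictionary layout, and the merchant IDs are illustrative assumptions; the repository computes this in SQL in `01_mrr_cohort_retention.sql`, and that query may differ in detail.

```python
def net_revenue_retention(prev_mrr, curr_mrr):
    """Return NRR as a fraction, given MRR per merchant for two consecutive months.

    Only merchants present in the previous month count toward the numerator,
    so revenue from newly acquired merchants does not inflate retention.
    """
    starting = sum(prev_mrr.values())
    if starting <= 0:
        raise ValueError("previous-month MRR must be positive")
    # A merchant missing from curr_mrr has churned and contributes 0.
    retained = sum(curr_mrr.get(merchant, 0) for merchant in prev_mrr)
    return retained / starting


# Hypothetical data: m1 expanded, m2 downgraded, m3 churned, m4 is new.
prev = {"m1": 500, "m2": 300, "m3": 200}
curr = {"m1": 650, "m2": 250, "m4": 400}

nrr = net_revenue_retention(prev, curr)
print(f"NRR: {nrr:.1%}")  # NRR: 90.0%
```

A value above 100% means expansion outpaced downgrades and churn within the existing base.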
.
├── README.md                                # This file
├── dataset/
│   ├── schema_ddl.sql                       # Postgres schema, constraints, & indexes
│   └── generate_enterprise_data.py          # Python script to generate realistic 50K+ row datasets
├── sql/
│   ├── 01_mrr_cohort_retention.sql          # Monthly churn and NRR calculation
│   ├── 02_fraud_anomaly_detection.sql       # Rolling Z-score anomaly detection for chargebacks
│   └── 03_payment_routing_optimization.sql  # Gateway success rate analysis
└── dashboard/
    ├── index.html                           # Premium layout and structure
    ├── index.css                            # Modern glassmorphism UI styling
    ├── app.js                               # Chart.js visualization logic
    └── mock_data.json                       # Rendered SQL results for dashboard
Most entry-level projects focus solely on aggregating sales (e.g. "Top 10 Products"). This project solves actual business-critical problems:
- The Subscription Problem: Subscriptions expand, downgrade, and churn. Query `01` handles these states over time.
- The Fraud Problem: Chargebacks happen normally, but sudden spikes indicate fraud rings. Query `02` establishes a baseline average and looks for anomalies mathematically.
- The Engineering Problem: Building a project is one thing; making it interactive is another. The `dashboard/` directory proves strong end-to-end full-stack capabilities, separating this from standard Jupyter notebook tutorials.
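
To make the baseline-and-anomaly idea behind the Fraud Problem concrete, here is a minimal Python sketch of a rolling Z-score check. It is not the repository's SQL: the function name, the 7-day window, the threshold of 3, and the sample counts are all assumptions chosen for illustration. Each day is compared against the mean and sample standard deviation of the preceding days only, so a spike cannot inflate its own baseline.

```python
from collections import deque
from statistics import mean, stdev


def rolling_zscore_flags(daily_counts, window=7, threshold=3.0):
    """Return (day_index, count, z_score) for days far above the trailing baseline."""
    history = deque(maxlen=window)  # holds the previous `window` days only
    flags = []
    for day, count in enumerate(daily_counts):
        if len(history) == window:
            baseline = mean(history)
            spread = stdev(history)
            # A perfectly flat history has zero spread; skip it to avoid dividing by zero.
            if spread > 0:
                z = (count - baseline) / spread
                if z > threshold:
                    flags.append((day, count, round(z, 2)))
        # Append after scoring so the current day never joins its own baseline.
        history.append(count)
    return flags


# Hypothetical daily chargeback counts for one merchant; index 9 is a sudden spike.
counts = [4, 5, 3, 6, 4, 5, 4, 5, 6, 30, 5, 4]
flags = rolling_zscore_flags(counts)
for day, count, z in flags:
    print(f"day {day}: {count} chargebacks, z = {z}")
```

One consequence of this design is that a spike stays in the window for the following days and raises the baseline, which temporarily makes further spikes harder to flag. A production query might exclude flagged days from the baseline or use a more robust statistic.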
- Generate Data: Run `python dataset/generate_enterprise_data.py`. This writes CSV files that simulate live customer data.
- Execute Schema: Load `dataset/schema_ddl.sql` into Postgres to set up the schema.
- Import Data: Use `COPY` or an IDE to import the CSVs.
- Analyze: Run the queries inside `/sql`.
- View Dashboard: Open `/dashboard/index.html` in your browser. (Note: the dashboard currently reads from `mock_data.json`, which is based on expected SQL results, for immediate viewing.)