Siddharthsid12/README.md

Hey, I'm Siddharth Gupta 👋

Data Engineer with 2+ years of experience building scalable data pipelines, cloud warehouses and analytics platforms.

I design and implement end-to-end data solutions — from raw ingestion to production-ready analytics — using modern tools like PySpark, Databricks, Delta Lake and Azure.


🔧 What I Work With

Languages        Python · SQL 
Processing       PySpark · Apache Spark · Databricks
Storage          Delta Lake · Azure Data Lake Gen2 · PostgreSQL
Cloud            Azure (ADLS, ADF, Synapse, Databricks) · AWS (S3, Glue, Lambda) · GCP
Modeling         Data Vault 2.0 · Star Schema · SCD Type 2 · Dimensional Modeling
Orchestration    Apache Airflow · Azure Data Factory
Quality          Great Expectations · Custom Validation Frameworks
CI/CD            Azure DevOps · GitHub Actions · Git · Docker
IaC              Terraform

📂 Featured Projects

azure-databricks-lakehouse: End-to-end data lakehouse on Azure Databricks with Medallion Architecture (Bronze → Silver → Gold), Star Schema with SCD Type 2, an automated data quality framework, and a full CI/CD pipeline on Azure DevOps.
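For readers unfamiliar with SCD Type 2, here is a minimal sketch of the idea in plain Python. It is not the project's code, which runs on Databricks and Delta Lake. The function name `apply_scd2` and the columns `valid_from`, `valid_to`, and `is_current` are assumptions chosen for illustration.

```python
from datetime import date


def apply_scd2(dimension, incoming, key, tracked, load_date):
    """Return a new dimension table (list of dicts) with SCD Type 2 applied.

    Hypothetical helper: column names are illustrative assumptions.
    """
    # Copy rows so the caller's table is left untouched.
    result = [dict(row) for row in dimension]
    current = {row[key]: row for row in result if row["is_current"]}

    for rec in incoming:
        existing = current.get(rec[key])
        if existing is not None and all(existing[c] == rec[c] for c in tracked):
            continue  # tracked attributes unchanged: keep the current version

        if existing is not None:
            # Close the old version instead of overwriting it.
            existing["valid_to"] = load_date
            existing["is_current"] = False

        # Open a new current version for new or changed keys.
        new_row = {
            key: rec[key],
            **{c: rec[c] for c in tracked},
            "valid_from": load_date,
            "valid_to": None,
            "is_current": True,
        }
        result.append(new_row)
        current[rec[key]] = new_row

    return result
```

A changed record closes the previous row and appends a new current row, so history is preserved rather than overwritten.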

data-vault-warehouse: Data Vault 2.0 warehouse with Hubs, Links, and Satellites built from 3 source systems, with hash-diff change detection and Star Schema marts with SCD Type 2, orchestrated by Apache Airflow and containerized with Docker.
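Hash-diff change detection can be sketched in a few lines of Python. This is an illustrative sketch, not the repository's implementation. The helper names, the `||` delimiter, the SHA-256 choice, and the trim and upper-case normalisation are all assumptions.

```python
import hashlib


def hash_diff(record, attributes):
    """Hash a record's descriptive attributes in a fixed order."""
    # Normalise values so cosmetic differences do not register as changes:
    # None becomes an empty string, text is trimmed and upper-cased.
    parts = [
        "" if record.get(a) is None else str(record[a]).strip().upper()
        for a in attributes
    ]
    return hashlib.sha256("||".join(parts).encode("utf-8")).hexdigest()


def changed_records(incoming, current_hashes, key, attributes):
    """Return incoming records that are new or whose hash-diff differs.

    current_hashes maps a business key to the hash-diff of the latest
    satellite row for that key (a hypothetical in-memory stand-in).
    """
    changed = []
    for rec in incoming:
        digest = hash_diff(rec, attributes)
        if current_hashes.get(rec[key]) != digest:
            changed.append({**rec, "hash_diff": digest})
    return changed
```

Only the records returned by `changed_records` would need a new satellite row. Unchanged records are skipped by comparing a single hash instead of every column.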


🧠 What I'm Focused On

  • Building production-grade data warehouse architectures
  • Designing reliable pipelines with automated quality gates and full data lineage
  • Cloud-native data platforms on Azure and Databricks
  • Writing clean, testable, well-documented data engineering code

📫 Let's Connect

If you're working on data engineering challenges or looking for a collaborator, feel free to reach out.

LinkedIn Email

Pinned

  1. azure-databricks-lakehouse (Public, Python): End-to-end data lakehouse on Azure Databricks with Medallion Architecture, Star Schema, SCD Type 2, CI/CD, and a Data Quality Framework

  2. Claude-skills (Public): Claude Skill that actually helps

  3. data-vault-warehouse (Public, Python)