Data Engineer with 2+ years of experience building scalable data pipelines, cloud data warehouses, and analytics platforms.
I design and implement end-to-end data solutions — from raw ingestion to production-ready analytics — using modern tools like PySpark, Databricks, Delta Lake and Azure.
| Category | Tools |
| --- | --- |
| Languages | Python · SQL |
| Processing | PySpark · Apache Spark · Databricks |
| Storage | Delta Lake · Azure Data Lake Gen2 · PostgreSQL |
| Cloud | Azure (ADLS, ADF, Synapse, Databricks) · AWS (S3, Glue, Lambda) · GCP |
| Modeling | Data Vault 2.0 · Star Schema · SCD Type 2 · Dimensional Modeling |
| Orchestration | Apache Airflow · Azure Data Factory |
| Quality | Great Expectations · Custom Validation Frameworks |
| CI/CD | Azure DevOps · GitHub Actions · Git · Docker |
| IaC | Terraform |
- End-to-end data lakehouse on Azure Databricks with Medallion Architecture (Bronze → Silver → Gold), Star Schema with SCD Type 2, an automated data quality framework, and a full CI/CD pipeline on Azure DevOps.
- Data Vault 2.0 warehouse with Hubs, Links, and Satellites from 3 source systems, hash-diff change detection, Star Schema marts with SCD Type 2, orchestrated by Apache Airflow and containerized with Docker.
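The hash-diff change detection used in the Data Vault project can be sketched in plain Python. This is a minimal illustration of the general technique, not the project's actual code: the column names, delimiter, and normalization rules are assumptions for the example.

```python
import hashlib


def hash_diff(record: dict, columns: list[str], delimiter: str = "||") -> str:
    """Hash the concatenated descriptive attributes of a record.

    In Data Vault 2.0, a satellite row stores this hash; an incoming row
    whose hash differs from the stored one is treated as a change and
    inserted as a new satellite record.
    """
    # Normalize each value: string-cast, trim, uppercase; empty string for NULLs.
    parts = [str(record.get(col) or "").strip().upper() for col in columns]
    payload = delimiter.join(parts)
    return hashlib.md5(payload.encode("utf-8")).hexdigest()


# Illustrative usage: compare the stored satellite row with an incoming row.
columns = ["name", "email", "city"]  # descriptive attributes (hypothetical schema)
stored = {"name": "Ada", "email": "ada@example.com", "city": "London"}
incoming = {"name": "Ada", "email": "ada@example.com", "city": "Paris"}

changed = hash_diff(stored, columns) != hash_diff(incoming, columns)
# changed is True here, so the pipeline would append a new satellite row.
```

Comparing one hash per row instead of every column individually keeps change detection cheap at scale; the same idea translates directly to PySpark with `sha2(concat_ws(...))`.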
- Building production-grade data warehouse architectures
- Designing reliable pipelines with automated quality gates and full data lineage
- Building cloud-native data platforms on Azure and Databricks
- Writing clean, testable, well-documented data engineering code
If you're working on data engineering challenges or looking for a collaborator, feel free to reach out.