Building reliable data infrastructure, one transformation at a time
I transform raw data into reliable insights. I currently specialize in modern data stack engineering, with a focus on:
- Data Modeling: Dimensional modeling, slowly changing dimensions, data vault
- Pipeline Orchestration: Airflow DAGs, dependency management, error handling
- Analytics Engineering: dbt transformations, incremental materialization, data quality
- Cloud Infrastructure: AWS data services, infrastructure-as-code, cost optimization
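To make the slowly changing dimensions item concrete, here is a minimal sketch of a Type 2 update in plain Python. It is an illustration only, not code from any project described here: the `DimRow` fields, the `apply_scd2` helper, and the sample customer data are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class DimRow:
    customer_id: int                  # business key (hypothetical)
    city: str                         # tracked attribute (hypothetical)
    valid_from: date
    valid_to: Optional[date] = None   # None marks the current version


def apply_scd2(dim: list, customer_id: int, city: str, as_of: date) -> None:
    """Expire the current row and append a new version when the attribute changes."""
    current = next(
        (r for r in dim if r.customer_id == customer_id and r.valid_to is None),
        None,
    )
    if current is not None and current.city == city:
        return  # nothing changed, so history stays as it is
    if current is not None:
        current.valid_to = as_of  # close out the old version
    dim.append(DimRow(customer_id, city, valid_from=as_of))


dim = []
apply_scd2(dim, 1, "Nairobi", date(2024, 1, 1))
apply_scd2(dim, 1, "Nairobi", date(2024, 3, 1))  # unchanged: no new row
apply_scd2(dim, 1, "Mombasa", date(2024, 6, 1))  # changed: new version appended
```

After the three calls the dimension holds two rows for customer 1: the expired Nairobi version and the current Mombasa version. In a warehouse the same idea is usually expressed as a merge statement or a snapshot rather than an in-memory list.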
Transitioning into production analytics engineering after completing a 1-year intensive data science certification and earning my AWS Cloud Practitioner certification. Building portfolio projects that solve real business problems with clean code and thoughtful architecture.
Production-Grade E-Commerce Analytics Platform
Building an end-to-end analytics pipeline that processes customer transaction data using the modern data stack.
Stack: dbt • PostgreSQL • Airflow • Python • Docker • Snowflake
Highlights: Incremental ETL, data quality testing, CI/CD automation
Technical Deep Dive
Architecture:
- Medallion architecture (Bronze → Silver → Gold layers)
- Incremental materialization for performance
- Great Expectations for data quality
- GitHub Actions for continuous deployment
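The incremental materialization item can be sketched as a watermark-based load. This is a simplified illustration under stated assumptions, not the project's actual implementation: the `incremental_load` function, the `order_id` key, and the `updated_at` column are hypothetical names. It loosely mirrors the filter-then-upsert idea behind incremental models in dbt.

```python
from datetime import datetime


def incremental_load(source, target, watermark):
    """Upsert source rows newer than `watermark`; return (new watermark, rows processed)."""
    new_rows = [row for row in source if row["updated_at"] > watermark]
    for row in new_rows:
        target[row["order_id"]] = row  # upsert keyed on order_id
    # If nothing new arrived, the watermark stays where it was.
    new_watermark = max((row["updated_at"] for row in new_rows), default=watermark)
    return new_watermark, len(new_rows)


source = [
    {"order_id": 1, "amount": 10.0, "updated_at": datetime(2025, 1, 1)},
    {"order_id": 2, "amount": 25.0, "updated_at": datetime(2025, 1, 2)},
]
target = {}

# First run: every row is newer than the initial watermark.
watermark, first_count = incremental_load(source, target, datetime.min)

# A late update to order 1 arrives; only that row is reprocessed.
source.append({"order_id": 1, "amount": 12.0, "updated_at": datetime(2025, 1, 3)})
watermark, second_count = incremental_load(source, target, watermark)
```

The second run touches one row instead of three, which is where the performance benefit comes from as tables grow.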
Business Impact:
- Reduces data processing time by 80%
- Automated data quality checks catch 99% of issues
- Self-service analytics layer for stakeholders
Detail-Oriented Engineering
I don't just make pipelines work; I make them maintainable, testable, and cost-efficient.
Business-First Mindset
Every technical decision traces back to business value. Data engineering isn't just moving data; it's enabling better decisions.
Continuous Learner
From data science certification to cloud engineering to analytics engineering, I'm always expanding my technical horizons.
Global Perspective, Local Impact
Based in Nairobi, building skills that compete globally while looking to create impact locally.
def approach_to_data_engineering():
    principles = {
        "quality": "Test everything, twice",
        "efficiency": "Automate the boring stuff",
        "clarity": "Code is read more than written",
        "impact": "Focus on business value",
    }
    return principles

graph LR
A[Modern Data Stack] --> B[dbt Mastery]
A --> C[Airflow Expertise]
A --> D[Cloud Architecture]
B --> E[Production Projects]
C --> E
D --> E
E --> F[Analytics Engineering Role]
Currently Building:
- Production-grade data pipelines
- Data quality frameworks
- CI/CD for analytics code
- Real-time streaming (next phase)
Last updated: January 2025
Currently:
- Building: Data engineering and analytics engineering projects
- Learning: Advanced data modeling patterns (Kimball methodology)
- Seeking: Data Engineer / Analytics Engineer roles
- Reading: "The Data Warehouse Toolkit" by Ralph Kimball
This Week:
- Implementing incremental loads in dbt
- Building Airflow DAGs for orchestration
- Networking with data engineers on LinkedIn
- Contributing to data engineering communities
"The best data pipeline is the one you don't have to think about; it just works, scales, and alerts you when it doesn't."
I believe in:
- Automation over manual work: If I do it twice, I automate it
- Documentation as code: Good docs prevent 3 AM debugging sessions
- Test-driven development: Catch bugs before they catch you
- Incremental improvement: Small wins compound into excellence
AWS Certified Cloud Practitioner • 2024
ALX Data Science Tech Programs • 1-Year Program • 2023-2024
When I'm not building data pipelines, I'm probably:
- Training for a football tournament around Nairobi
- Experimenting with pour-over coffee (yes, I track the extraction ratios in a spreadsheet)
- Reading technical blogs and data engineering case studies
- Playing chess online (data analysis extends to opening theory!)
I've written SQL queries that join 10+ tables without losing my sanity. My secret? CTEs, lots of CTEs.
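As a small illustration of the CTE habit, the sketch below runs a two-step query against an in-memory SQLite database. The `customers` and `orders` tables and their rows are made up for the example; each CTE names one intermediate step so the final `SELECT` stays readable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'Amina'), (2, 'Brian');
    INSERT INTO orders VALUES (10, 1, 40.0), (11, 1, 60.0), (12, 2, 25.0);
""")

query = """
WITH order_totals AS (
    -- Step 1: aggregate orders per customer
    SELECT customer_id, SUM(amount) AS total_spent
    FROM orders
    GROUP BY customer_id
),
customer_totals AS (
    -- Step 2: attach customer names to the aggregates
    SELECT c.name, t.total_spent
    FROM customers AS c
    JOIN order_totals AS t ON t.customer_id = c.customer_id
)
SELECT name, total_spent
FROM customer_totals
ORDER BY total_spent DESC
"""

rows = conn.execute(query).fetchall()
conn.close()
```

With more tables, the same pattern scales by adding one named CTE per join or aggregation step instead of nesting subqueries.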
I'm actively seeking Analytics Engineer or Junior Data Engineer roles where I can:
- Build scalable data infrastructure
- Work with the modern data stack (dbt, Airflow, Snowflake, Databricks)
- Collaborate with data teams solving real problems
- Learn from experienced engineers
Reach out if you're:
- Hiring for analytics engineering roles
- Interested in discussing data architecture
- Building something interesting in the data space
- Looking for collaboration on open-source data tools
Email: otienoduncan99@gmail.com
LinkedIn: duncan-otieno
Location: Nairobi, Kenya (Open to remote)
Timezone: EAT (UTC+3)
+ Building production-grade projects
+ Networking with data engineering community
+ Actively seeking analytics engineering roles
! Available for opportunities - Let's build something great together

"Data is the new electricity, and Engineers are the power grid. Keep Building, Keep Automating, Keep Scaling."
