RashadHummatov85/Data_weather_project

Data Weather Project

A real-time weather data pipeline that extracts live weather data from an API, transforms it using dbt, stores it in PostgreSQL, and visualizes it in Apache Superset — all orchestrated by Apache Airflow and containerized with Docker.


Architecture

Live Data API --> Extract (Python) --> Transform (dbt) --> Load (PostgreSQL) --> Report (Superset)
                                            ^                    ^
                                            |                    |
                                   Orchestrate & Automate (Apache Airflow)
                                                Containerize (Docker)

Pipeline steps:

  1. Extract — Python script fetches current weather data for Baku from the Weatherstack API
  2. Transform — dbt cleans, deduplicates, and models the raw data into staging and mart layers
  3. Load — Transformed data is stored in PostgreSQL
  4. Report — Apache Superset visualizes the weather metrics (temperature, wind speed, visibility, feels like)
  5. Orchestrate — Apache Airflow runs the pipeline every 50 minutes automatically
  6. Containerize — The entire stack runs with Docker Compose
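The extract step above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual `api_request.py`: the function names (`build_url`, `fetch_weather`, `to_record`) are hypothetical, and the endpoint URL and response layout (`location` and `current` objects) are assumed from Weatherstack's current-weather API.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed Weatherstack current-weather endpoint; verify against your plan's docs.
API_URL = "http://api.weatherstack.com/current"


def build_url(api_key: str, city: str = "Baku") -> str:
    """Build the request URL with the access key and city as query parameters."""
    return f"{API_URL}?{urlencode({'access_key': api_key, 'query': city})}"


def fetch_weather(api_key: str, city: str = "Baku") -> dict:
    """Call the API and return the decoded JSON payload (needs network access)."""
    with urlopen(build_url(api_key, city), timeout=10) as response:
        return json.load(response)


def to_record(payload: dict) -> dict:
    """Flatten the nested payload into the fields tracked by this project."""
    location = payload["location"]
    current = payload["current"]
    return {
        "city": location["name"],
        "time": location["localtime"],
        "temperature": current["temperature"],
        "feelslike": current["feelslike"],
        "visibility": current["visibility"],
        "wind_speed": current["wind_speed"],
        # The API returns a list of descriptions; join them into one string.
        "weather_descriptions": ", ".join(current["weather_descriptions"]),
    }
```

Keeping the URL building and the flattening separate from the network call means both can be checked without an API key.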

Project Structure

data_weather_project/
├── api_request/
│   ├── api_request.py        # Fetches weather data from Weatherstack API
│   └── insert_records.py     # Connects to PostgreSQL and inserts records
├── airflow/
│   └── dags/
│       └── orchestrator.py   # Airflow DAG: fetch → transform (every 50 min)
├── dbt/
│   ├── profiles.yml
│   └── my_project/
│       └── models/
│           ├── sources/       # Raw source definition
│           ├── staging/       # stg_weather_data (deduplicated)
│           └── mart/          # daily_average, weather_report
├── docker/
│   ├── docker-bootstrap.sh
│   ├── docker-init.sh
│   ├── superset_config.py
│   └── .env
├── postgres/
│   ├── airflow_init.sql
│   └── superset_init.sql
└── docker-compose.yaml
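As an illustration of what `insert_records.py` is described as doing, here is a minimal sketch of a parameterized insert. The table name `raw_weather_data`, the column list, and the `insert_record` helper are assumptions for this example and may not match the repository. The function accepts any DB-API connection, so in practice it would be given one from a driver such as psycopg2.

```python
# Hypothetical column list, taken from the "Fields tracked" line of this README.
COLUMNS = (
    "city", "time", "temperature", "feelslike",
    "visibility", "wind_speed", "weather_descriptions",
)

# Column names are constants, so formatting them into the SQL is safe;
# the values themselves are always passed as parameters.
INSERT_SQL = "INSERT INTO raw_weather_data ({cols}) VALUES ({marks})".format(
    cols=", ".join(f'"{c}"' for c in COLUMNS),
    marks=", ".join(["%s"] * len(COLUMNS)),  # %s is psycopg2's placeholder style
)


def insert_record(conn, record: dict) -> None:
    """Insert one weather record and commit, using a DB-API connection."""
    params = tuple(record[c] for c in COLUMNS)
    cursor = conn.cursor()
    try:
        cursor.execute(INSERT_SQL, params)
        conn.commit()
    finally:
        cursor.close()
```

Passing values as parameters rather than formatting them into the SQL string leaves quoting and escaping to the driver.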

Tech Stack

| Tool | Purpose |
| --- | --- |
| Python | Extract weather data from API |
| PostgreSQL | Store raw and transformed data |
| dbt | Data transformation and modeling |
| Apache Airflow | Pipeline orchestration & scheduling |
| Apache Superset | Data visualization & dashboards |
| Docker & Docker Compose | Containerization of all services |

dbt Models

| Model | Type | Description |
| --- | --- | --- |
| stg_weather_data | Table | Deduplicates raw records, converts UTC offset |
| daily_average | Table | Daily average temperature and wind speed per city |
| weather_report | Table | Clean weather report with all key metrics |

Fields tracked: temperature, feelslike, visibility, wind_speed, weather_descriptions, city, time
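The dbt models themselves are SQL. The Python sketch below only illustrates the logic described for `stg_weather_data` (deduplication) and `daily_average` (per-city daily means). It assumes a duplicate is a row with the same `city` and `time`, and that `time` is a string starting with `YYYY-MM-DD`; the real models may define both differently.

```python
from collections import defaultdict
from statistics import mean


def deduplicate(rows: list[dict]) -> list[dict]:
    """Keep the first row seen for each (city, time) pair."""
    seen = set()
    unique = []
    for row in rows:
        key = (row["city"], row["time"])
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


def daily_average(rows: list[dict]) -> dict:
    """Average temperature and wind speed per (city, day)."""
    groups = defaultdict(list)
    for row in rows:
        day = row["time"][:10]  # assumes a "YYYY-MM-DD ..." timestamp string
        groups[(row["city"], day)].append(row)
    return {
        key: {
            "avg_temperature": mean([r["temperature"] for r in group]),
            "avg_wind_speed": mean([r["wind_speed"] for r in group]),
        }
        for key, group in groups.items()
    }
```

Deduplicating before averaging matters here: a repeated API reading would otherwise be counted twice in the daily mean.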


Services & Ports

| Service | Port |
| --- | --- |
| PostgreSQL | 5000 |
| Apache Airflow | 8000 |
| Apache Superset | 8088 |
| Redis | 6379 |
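Once the stack is running, a quick way to confirm that each service is listening is to try a TCP connection on the ports above. This is a small sketch that assumes the ports are published on `localhost`; the helper names (`is_port_open`, `check_services`) are hypothetical and not part of the repository.

```python
import socket

# Ports as listed in the Services & Ports table.
SERVICE_PORTS = {
    "PostgreSQL": 5000,
    "Apache Airflow": 8000,
    "Apache Superset": 8088,
    "Redis": 6379,
}


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_services(host: str = "localhost") -> dict:
    """Map each service name to whether its port accepts connections."""
    return {name: is_port_open(host, port) for name, port in SERVICE_PORTS.items()}


if __name__ == "__main__":
    for name, is_up in check_services().items():
        print(f"{name}: {'up' if is_up else 'down'}")
```

An open port only shows that something is listening; it does not confirm that the service behind it is healthy.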

Getting Started

Prerequisites

Docker with Docker Compose, and a Weatherstack API key.

Run the stack

git clone https://github.com/RashadHummatov85/Data_weather_project.git
cd Data_weather_project

Add your API key to api_request/api_request.py:

api_key = "your_weatherstack_api_key"
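As an alternative to hard-coding the key, it could be read from an environment variable so that it stays out of version control. This is only a sketch: the variable name `WEATHERSTACK_API_KEY` and the `get_api_key` helper are assumptions, not something the repository defines, and the variable would also have to be passed into whichever container runs the extract step.

```python
import os


def get_api_key(env_var: str = "WEATHERSTACK_API_KEY") -> str:
    """Read the API key from the environment, failing loudly if it is missing."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set {env_var} before running the extract step")
    return key
```

With this helper, the assignment above would become `api_key = get_api_key()`.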

Start all services:

docker-compose up -d

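Access the UIs on the ports listed under Services & Ports (for example, Apache Airflow on 8000 and Apache Superset on 8088).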


Visualization

The Superset dashboard tracks real-time weather metrics for Baku:

  • Average Temperature
  • Average Wind Speed
  • Sum of Visibility
  • Feels Like temperature

About

An automated ELT pipeline that goes from ingesting live API data to a near-real-time dashboard, built on Windows WSL (Ubuntu 24) with Docker images of Apache Airflow 3.0 (the latest release at the time of writing), dbt, Postgres, and Apache Superset.
