A data pipeline project that extracts weather data from the OpenWeather API and transforms it using dbt (data build tool) with PostgreSQL as the data warehouse.
This project consists of two main components:
- Data Pipeline: Python scripts for extracting weather data from OpenWeather API
- dbt Transformations: Data modeling and transformation using dbt
## Prerequisites

- Python 3.12+
- PostgreSQL database
- OpenWeather API key
## Installation

Install dependencies using uv (recommended) or pip:

```bash
# Using uv
uv sync

# Or using pip
pip install -r requirements.txt
```

## Configuration

Set up your PostgreSQL database and update the connection details in `weather/scripts/.env`:

```env
OPENWEATHER_API_KEY=your_api_key_here
POSTGRES_DB=postgres
POSTGRES_USER=postgres
POSTGRES_PASSWORD=postgres
POSTGRES_HOST=localhost
POSTGRES_PORT=5432
```
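The scripts read these values at runtime. Below is a minimal sketch of how that wiring can look, assuming `python-dotenv` and `psycopg2` (the actual dependencies in `requirements.txt` may differ):

```python
import os

import psycopg2
from dotenv import load_dotenv

# Load variables from weather/scripts/.env into the process environment
load_dotenv()

# Open a PostgreSQL connection from the same variables shown above
conn = psycopg2.connect(
    dbname=os.getenv("POSTGRES_DB", "postgres"),
    user=os.getenv("POSTGRES_USER", "postgres"),
    password=os.getenv("POSTGRES_PASSWORD", "postgres"),
    host=os.getenv("POSTGRES_HOST", "localhost"),
    port=os.getenv("POSTGRES_PORT", "5432"),
)
```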
## dbt Configuration

The dbt project is configured in `weather/dbt_project.yml` with:

- Staging models: materialized as views in the `staging` schema
- Marts models: materialized as incremental tables in the `marts` schema (see the sketch below)
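For reference, an incremental model in dbt declares its materialization in a `config` block like the following. The model and column names here are placeholders, not this repo's actual models:

```sql
-- models/marts/fct_weather_example.sql (hypothetical model)
{{ config(materialized='incremental', unique_key='observation_id') }}

select *
from {{ ref('stg_weather_example') }}

{% if is_incremental() %}
-- on incremental runs, only process rows newer than what is already loaded
where observed_at > (select max(observed_at) from {{ this }})
{% endif %}
```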
## Usage

Extract weather data using the Python pipeline:

```bash
cd weather/scripts
uv run data_pipeline.py
```
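At its core, the extraction step calls OpenWeather's current-weather endpoint and stores the JSON response. A self-contained sketch of that call (the city and the printed output are illustrative; `data_pipeline.py` itself also loads the result into PostgreSQL):

```python
import json
import os

import requests
from dotenv import load_dotenv

load_dotenv()

# OpenWeather's current-weather endpoint
API_URL = "https://api.openweathermap.org/data/2.5/weather"


def fetch_weather(city: str) -> dict:
    """Fetch the current weather for one city from the OpenWeather API."""
    response = requests.get(
        API_URL,
        params={"q": city, "appid": os.getenv("OPENWEATHER_API_KEY")},
        timeout=10,
    )
    response.raise_for_status()
    return response.json()


print(json.dumps(fetch_weather("London"), indent=2))
```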
Navigate to the weather directory and run dbt commands:

```bash
cd weather
# Install dbt packages
dbt deps
# Run all models
dbt run
# Run tests
dbt test
# Generate and serve documentation
dbt docs generate
dbt docs serve
```

## Project Structure

### Data Pipeline (weather/scripts/)
- Extracts weather data from OpenWeather API
- Loads raw data into PostgreSQL

### dbt Models (weather/models/)

- Sources: Raw data source definitions
- Staging: Data cleaning and initial transformations (see the example after this list)
- Marts: Business logic and final data models
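As an example of the staging layer, a model that unpacks raw JSON into typed columns might look like this. The source and column names are hypothetical, not this repo's actual schema:

```sql
-- models/staging/stg_weather_example.sql (hypothetical)
select
    payload ->> 'name'                      as city_name,
    (payload -> 'main' ->> 'temp')::numeric as temperature_kelvin,
    loaded_at
from {{ source('weather', 'weather_data') }}
```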

### Macros (weather/macros/)

- `convert_to_local.sql`: Custom macro for timezone conversion
- `get_custom_schema.sql`: Custom schema naming logic
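The actual implementations live in `weather/macros/`. For flavor, a timezone-conversion macro in dbt can be as small as the following; this is an illustrative version, not the repo's exact code:

```sql
-- macros/convert_to_local.sql (illustrative version)
{% macro convert_to_local(timestamp_column, timezone_column) %}
    ({{ timestamp_column }} at time zone 'UTC' at time zone {{ timezone_column }})
{% endmacro %}
```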
## Development

- Create new model files in the appropriate subdirectory of `weather/models/`
- Add tests in `weather/tests/` or inline in model files (see the example below)
- Run `dbt run` and `dbt test` to validate
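Singular tests in `weather/tests/` are plain SELECT statements; dbt fails the test if the query returns any rows. A hypothetical example (the model and column names are placeholders):

```sql
-- tests/assert_temperature_in_range.sql (hypothetical)
-- fails if any observed temperature falls outside a plausible Kelvin range
select *
from {{ ref('stg_weather_example') }}
where temperature_kelvin not between 150 and 350
```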
## Useful Commands

```bash
# Clean generated files
dbt clean
# Compile models without running
dbt compile
# Check for schema changes
dbt run-operation check_for_schema_changes
```

## License

This project is licensed under the MIT License.
