virbahu/scope3-gnn-mapper

🌍 scope3-gnn-mapper

Python 3.9+ PyTorch PyG License: MIT GHG Protocol Google Scholar

Graph Neural Network framework for mapping, attributing, and predicting Scope 3 GHG emissions across multi-tier supply chains.


📋 Overview

scope3-gnn-mapper is an open-source Python framework that applies Graph Neural Networks (GNNs) to the Scope 3 supply chain emissions mapping challenge.

Traditional Scope 3 accounting relies on spend-based estimates, supplier surveys, and static input-output (IO) tables. These methods introduce systematic errors of 40–70% and fail to capture the dynamic, multi-tier nature of modern supply chains.

This framework addresses these limitations by:

  • Representing supplier relationships as a heterogeneous graph where nodes are companies/facilities and edges are procurement flows
  • Using Graph Attention Networks (GAT) and GraphSAGE to propagate emission intensities through the supply chain graph
  • Enabling uncertainty quantification at each node via Monte Carlo Dropout or Bayesian GNN layers
  • Providing automated hotspot detection to identify high-emission upstream tiers
  • Supporting what-if scenario analysis for supplier switching and decarbonization roadmaps
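
The idea of emissions flowing along procurement edges can be illustrated without any ML library. The following is a minimal sketch, not the framework's implementation: it swaps the learned GAT/GraphSAGE message passing for a fixed spend-weighted roll-up over a toy graph. The supplier names, intensities, and the `upstream_emissions` helper are hypothetical, and the flows are assumed to be already scaled to the focal company's share of each supplier's output.

```python
from collections import defaultdict

# Hypothetical toy data: direct emission intensity per supplier (kgCO2e per USD).
intensity = {"tier1_assembler": 0.20, "tier2_smelter": 1.50, "tier3_mine": 0.80}

# Directed procurement flows: (buyer, supplier, spend in USD).
flows = [
    ("focal_company", "tier1_assembler", 1000.0),
    ("tier1_assembler", "tier2_smelter", 400.0),
    ("tier2_smelter", "tier3_mine", 100.0),
]


def upstream_emissions(buyer, flows, intensity):
    """Sum the kgCO2e embodied in a buyer's purchases across all supplier tiers.

    Assumes the flow graph is acyclic; real supply chains may contain cycles
    that need explicit handling.
    """
    by_buyer = defaultdict(list)
    for b, s, spend in flows:
        by_buyer[b].append((s, spend))

    def visit(node):
        total = 0.0
        for supplier, spend in by_buyer[node]:
            total += spend * intensity[supplier]  # supplier's direct emissions
            total += visit(supplier)              # plus its own upstream purchases
        return total

    return visit(buyer)


total_kg = upstream_emissions("focal_company", flows, intensity)
```

A GNN replaces the fixed `spend * intensity` rule with learned, attention-weighted aggregation, which is what allows it to estimate intensities for suppliers that have no reported data.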

🖼️ System Overview

Scope 3 Supply Chain Mapping

Multi-tier supplier graph showing emission intensity propagation across Tier 1–4 suppliers. Node size = annual spend; node color = kgCO2e/USD intensity.


🏗️ Architecture Diagram

╔══════════════════════════════════════════════════════════════════╗
║             SCOPE 3 GNN MAPPER: SYSTEM ARCHITECTURE              ║
╠══════════════════════════════════════════════════════════════════╣
║                                                                  ║
║  [Raw Supply Chain Data]  [Emission Factor DBs]  [IoT Sensors]   ║
║    CSV / ERP / REST API   EXIOBASE / ECOINVENT     MQTT/REST     ║
║           │                      │                     │         ║
║           └──────────────────────┴─────────────────────┘         ║
║                                  │                               ║
║                    ┌─────────────▼────────────┐                  ║
║                    │ Graph Construction Layer │                  ║
║                    │ SupplyChainGraphBuilder  │                  ║
║                    │ • Node: Company/Facility │                  ║
║                    │ • Edge: Procurement Flow │                  ║
║                    │ • HeteroData (PyG)       │                  ║
║                    └─────────────┬────────────┘                  ║
║                                  │                               ║
║         ┌────────────────────────┼───────────────────┐           ║
║         │                        │                   │           ║
║  ┌──────▼───────┐   ┌────────────▼──────────┐ ┌──────▼───────┐   ║
║  │  GAT Mapper  │   │ GraphSAGE (Inductive) │ │ Bayesian GNN │   ║
║  │  8 Heads×4L  │   │ Multi-hop Aggregation │ │ Uncertainty  │   ║
║  └──────┬───────┘   └────────────┬──────────┘ └──────┬───────┘   ║
║         └────────────────────────┴───────────────────┘           ║
║                                  │                               ║
║                    ┌─────────────▼────────────┐                  ║
║                    │ Emission Attribution     │                  ║
║                    │ • Scope 3 Cat 1–15       │                  ║
║                    │ • Tier attribution       │                  ║
║                    │ • Confidence intervals   │                  ║
║                    └─────────────┬────────────┘                  ║
║                                  │                               ║
║        ┌─────────────────────────┼─────────────────────┐         ║
║        │                         │                     │         ║
║  ┌─────▼──────┐       ┌──────────▼────────┐   ┌────────▼──────┐  ║
║  │  Hotspot   │       │ Scenario Analysis │   │  Dashboard    │  ║
║  │  Detection │       │ Supplier Switching│   │  Streamlit +  │  ║
║  │  & Ranking │       │ Decarbonization   │   │  FastAPI      │  ║
║  └────────────┘       └───────────────────┘   └───────────────┘  ║
╚══════════════════════════════════════════════════════════════════╝

GNN Emission Propagation

GNN Architecture


❗ Problem Statement

The Invisible 90% Problem

Scope 3 emissions represent, on average, 70–90% of a company's total GHG footprint, yet they remain the most poorly measured category in enterprise sustainability reporting.

| Challenge | Traditional Approach | GNN Solution |
|---|---|---|
| Data Gaps | Spend-based estimates (±40% error) | Graph imputation from neighboring nodes |
| Multi-Tier Blindness | Tier 1 coverage only | N-tier propagation through graph |
| Static Data | Annual surveys (12-month lag) | Real-time IoT + procurement feeds |
| Attribution Errors | Manual category mapping | Automated GHG Protocol classification |
| Uncertainty | Point estimates only | Full posterior distributions per supplier |
| Scalability | 50–100 suppliers max | 100,000+ nodes via mini-batch training |
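
To make the "graph imputation from neighboring nodes" row concrete, here is a deliberately simple sketch under stated assumptions: it fills a missing emission intensity with the spend-weighted mean of neighbors that do have a value. This is a hand-written baseline rather than the learned imputation a GNN would perform, and `impute_intensity` and the sample values are hypothetical.

```python
def impute_intensity(neighbors, known):
    """Estimate a missing intensity from neighbors with known values.

    neighbors: list of (neighbor_id, spend_usd) pairs for the node being imputed.
    known: dict mapping neighbor_id -> intensity in kgCO2e/USD.
    Returns the spend-weighted mean, or None if no neighbor has a known value.
    """
    weighted = [(known[n], spend) for n, spend in neighbors if n in known]
    total_weight = sum(spend for _, spend in weighted)
    if total_weight == 0:
        return None
    return sum(value * spend for value, spend in weighted) / total_weight


# Supplier "c" has no reported intensity, so it is ignored in the average.
known = {"a": 0.5, "b": 1.5}
estimate = impute_intensity([("a", 300.0), ("b", 100.0), ("c", 50.0)], known)
```

A trained model would also condition on node features such as industry code and country, so two suppliers with the same neighbors need not receive the same estimate.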

"A supplier network is a graph. Emission factors flow through procurement edges. GNNs were built to reason over exactly this structure."


✅ Solution Overview

3-Phase Emission Intelligence Pipeline

Phase 1: Graph Construction

Raw supplier CSV and procurement flow data is ingested, normalized, and assembled into a PyTorch Geometric HeteroData object. Each supplier becomes a node with a 96-dimensional feature vector encoding industry codes, country of origin, revenue band, and matched emission factor database entries. Each procurement relationship becomes a directed edge with spend, volume, and commodity metadata.
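
As an illustration of the ingestion step, the sketch below turns two small CSV tables into a node index and a COO-style edge list (a pair of source and target index lists, the same layout PyTorch Geometric uses for `edge_index`). It uses only the standard library. The column names, the sample rows, the supplier-to-buyer edge direction, and the `build_graph` helper are assumptions made for the example, since the actual `suppliers.csv` and `flows.csv` schemas are not documented here.

```python
import csv
import io

# Hypothetical column layout and rows, for illustration only.
SUPPLIERS_CSV = """supplier_id,country,industry_code
S1,DE,C24
S2,CN,C25
S3,AU,B07
"""

FLOWS_CSV = """buyer_id,supplier_id,spend_usd
S1,S2,400000
S2,S3,100000
"""


def build_graph(suppliers_text, flows_text):
    """Return (node_index, edge_index, edge_spend) with edges in COO layout."""
    # Assign each supplier a consecutive integer id in file order.
    node_index = {}
    for row in csv.DictReader(io.StringIO(suppliers_text)):
        node_index[row["supplier_id"]] = len(node_index)

    sources, targets, spend = [], [], []
    for row in csv.DictReader(io.StringIO(flows_text)):
        # Assumed direction: goods (and embodied emissions) flow supplier -> buyer.
        sources.append(node_index[row["supplier_id"]])
        targets.append(node_index[row["buyer_id"]])
        spend.append(float(row["spend_usd"]))
    return node_index, [sources, targets], spend


node_index, edge_index, edge_spend = build_graph(SUPPLIERS_CSV, FLOWS_CSV)
```

In a PyG pipeline, `edge_index` would typically be converted to a `torch.long` tensor and `edge_spend` attached as an edge attribute.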

Phase 2: GNN-Based Emission Attribution

The GATEmissionMapper runs 4 attention layers with 8 heads each, progressively aggregating neighborhood information to estimate per-supplier emission intensities for all 15 GHG Protocol Scope 3 categories. Bayesian dropout layers provide posterior uncertainty estimates without requiring full Bayesian inference.
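
For readers unfamiliar with graph attention, the following sketch shows what a single attention head computes for one node: score each neighbor, normalize the scores with a softmax, and take the weighted sum of neighbor features. It is a simplified, dependency-free illustration and not the `GATEmissionMapper` code: the learned linear map is treated as the identity, and the feature vectors and attention vector `a` are made-up values.

```python
import math


def leaky_relu(x, slope=0.2):
    """LeakyReLU with the negative slope commonly used in GAT (0.2)."""
    return x if x > 0 else slope * x


def attention_aggregate(h_self, h_neighbors, a):
    """One attention head over a node's neighbors (linear map omitted).

    h_self: feature vector of the target node.
    h_neighbors: list of neighbor feature vectors, same length as h_self.
    a: attention vector of length 2 * len(h_self), applied to [h_self || h_j].
    Returns (attention_weights, aggregated_vector).
    """
    scores = []
    for h_j in h_neighbors:
        concat = h_self + h_j  # list concatenation: [h_self || h_j]
        scores.append(leaky_relu(sum(w * x for w, x in zip(a, concat))))

    # Softmax over neighbors, shifted by the max score for numerical stability.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    norm = sum(exps)
    alphas = [e / norm for e in exps]

    dim = len(h_self)
    out = [
        sum(alpha * h_j[k] for alpha, h_j in zip(alphas, h_neighbors))
        for k in range(dim)
    ]
    return alphas, out


alphas, out = attention_aggregate(
    h_self=[1.0, 0.0],
    h_neighbors=[[0.0, 1.0], [0.0, 3.0]],
    a=[0.0, 0.0, 0.0, 1.0],
)
```

With this choice of `a`, the second neighbor scores higher and therefore dominates the aggregate. A multi-head layer runs several such heads with different learned parameters and concatenates or averages their outputs.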

Phase 3: Hotspot Detection and Scenario Analysis

An R-Value scoring algorithm ranks suppliers by reduction potential per unit effort. The Scenario Engine models supplier switching, technology adoption, and procurement changes to generate science-aligned decarbonization roadmaps with quantified abatement potential.
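
The R-Value formula is not specified in this README, so the sketch below is only an assumed stand-in: it scores each supplier by reducible emissions per unit effort and estimates the abatement from a supplier switch as spend times the intensity difference. The `Supplier` fields, `rank_hotspots`, `switching_abatement`, and all numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Supplier:
    name: str
    spend_usd: float       # annual spend with this supplier
    intensity: float       # kgCO2e per USD
    achievable_cut: float  # assumed reducible fraction of emissions (0..1)
    effort: float          # assumed relative effort score (> 0)


def rank_hotspots(suppliers, top_k=20):
    """Rank by reducible emissions per unit effort (assumed scoring rule)."""
    def score(s):
        return s.spend_usd * s.intensity * s.achievable_cut / s.effort
    return sorted(suppliers, key=score, reverse=True)[:top_k]


def switching_abatement(current, alternative_intensity):
    """kgCO2e avoided per year if the same spend moves to another supplier."""
    return current.spend_usd * (current.intensity - alternative_intensity)


suppliers = [
    Supplier("steel_mill", 500_000, 1.2, 0.30, 2.0),
    Supplier("logistics_co", 200_000, 0.6, 0.50, 1.0),
    Supplier("packaging_co", 100_000, 0.3, 0.20, 1.0),
]
ranked = rank_hotspots(suppliers, top_k=2)
saved_kg = switching_abatement(suppliers[0], alternative_intensity=0.9)
```

A negative result from `switching_abatement` means the alternative supplier is more emission-intensive than the current one.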


💻 Code, Installation & Analysis

Prerequisites

| Requirement | Version / Capacity |
|---|---|
| Python | 3.9+ |
| CUDA (optional) | 11.8+ |
| RAM | 16 GB minimum (32 GB recommended) |
| Disk Space | 5 GB (emission factor databases) |

Installation

# Clone the repository
git clone https://github.com/virbahu/scope3-gnn-mapper.git
cd scope3-gnn-mapper

# Create virtual environment
python -m venv .venv
source .venv/bin/activate  # Linux/Mac
# .venv\Scripts\activate    # Windows

# Install with all dependencies
pip install -e ".[dev]"

# Download emission factor databases
python scripts/download_emission_factors.py --databases exiobase3,ecoinvent38

Quick Start

from scope3_gnn.data.graph_builder import SupplyChainGraphBuilder
from scope3_gnn.models.gat_mapper import GATEmissionMapper
from scope3_gnn.training.trainer import Scope3Trainer
from scope3_gnn.analysis.hotspot import HotspotDetector

# 1. Build supply chain graph
builder = SupplyChainGraphBuilder(emission_factor_db="exiobase3")
graph = builder.from_csv(
    suppliers_path="data/raw/suppliers.csv",
    flows_path="data/raw/flows.csv"
)
print(f"Graph: {graph.num_nodes} nodes, {graph.num_edges} edges")

# 2. Train GNN model
model = GATEmissionMapper(
    in_channels=96, hidden_channels=256, num_layers=4, heads=8
)
trainer = Scope3Trainer(model=model, graph=graph, epochs=200)
trainer.fit()

# 3. Attribution and hotspot analysis
results = model.attribute_emissions(graph)
detector = HotspotDetector(top_k=20)
hotspots = detector.detect(results, graph)
hotspots.plot_sankey()

Sample Output

SCOPE 3 EMISSION ATTRIBUTION REPORT - 2026
──────────────────────────────────────────────────────────────
 Category                    tCO2e/year    % Total   Confidence
──────────────────────────────────────────────────────────────
 Cat 1: Purchased Goods       45,823        34.1%      ±6.1%
 Cat 4: Upstream Transport    12,104         9.0%      ±8.3%
 Cat 11: Use of Sold Prods    38,912        29.0%     ±11.2%
 Cat 12: End-of-Life          14,332        10.7%      ±9.4%
 Other Categories (11)        23,057        17.2%     ±14.1%
──────────────────────────────────────────────────────────────
 TOTAL SCOPE 3               134,228       100.0%      ±7.8%
──────────────────────────────────────────────────────────────
 Graph Coverage: 847 direct suppliers | Model Accuracy: 91.3%

📦 Dependencies

[tool.poetry.dependencies]
python = "^3.9"
torch = "^2.0"
torch-geometric = "^2.3"
torch-scatter = "*"
torch-sparse = "*"
networkx = "^3.1"
neo4j = "^5.0"
pandas = "^2.0"
numpy = "^1.24"
scikit-learn = "^1.3"
fastapi = "^0.104"
streamlit = "^1.28"
plotly = "^5.17"
pydantic = "^2.0"

Emission Factor Databases

| Database | Coverage | Update Frequency |
|---|---|---|
| EXIOBASE 3 | 44 countries, 163 sectors | Annual |
| ecoinvent 3.8 | 16,000+ processes | Annual |
| GHG Protocol | 300+ factors | Quarterly |
| EPA EEIO | US industry IO tables | Annual |
| GLEC Framework | Transport factors | Bi-annual |

👀 Author

Virbahu Jain, Founder & CEO, Quantisage

Building the AI Operating System for Scope 3 emissions management and supply chain decarbonization.


🎓 Education: MBA, Kellogg School of Management, Northwestern University
🏭 Experience: 20+ years across manufacturing, life sciences, energy & public sector
🌍 Scope: Supply chain operations on five continents
📝 Research: Peer-reviewed publications on AI in sustainable supply chains
🔬 Patents: IoT and AI solutions for manufacturing and logistics

LinkedIn GitHub Google Scholar Quantisage


📄 License

MIT License. See LICENSE for details.


Quantisage Supply Chain Climate

Part of the Quantisage Open Source Initiative | AI × Supply Chain × Climate
