Graph Neural Network framework for mapping, attributing, and predicting Scope 3 GHG emissions across multi-tier supply chains.
scope3-gnn-mapper is an open-source Python framework that applies Graph Neural Networks (GNNs) to the Scope 3 supply chain emissions mapping challenge.
Traditional Scope 3 accounting relies on spend-based estimates, supplier surveys, and static input-output (IO) tables. These methods introduce systematic errors of 40–70% and fail to capture the dynamic, multi-tier nature of modern supply chains.
This framework addresses these limitations by:
- Representing supplier relationships as a heterogeneous graph where nodes are companies/facilities and edges are procurement flows
- Using Graph Attention Networks (GAT) and GraphSAGE to propagate emission intensities through the supply chain graph
- Enabling uncertainty quantification at each node via Monte Carlo Dropout or Bayesian GNN layers
- Providing automated hotspot detection to identify high-emission upstream tiers
- Supporting what-if scenario analysis for supplier switching and decarbonization roadmaps
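The Monte Carlo Dropout idea mentioned in the list above can be sketched without any deep learning library: keep dropout active at prediction time, run many stochastic forward passes, and read the spread of the outputs as a rough uncertainty estimate. The toy linear predictor below is only an illustration; `predict_with_dropout`, `mc_dropout_estimate`, the weights, and the feature values are hypothetical and are not part of the scope3-gnn-mapper API.

```python
import random
import statistics


def predict_with_dropout(features, weights, drop_prob, rng):
    """One stochastic forward pass of a toy linear model with inverted dropout."""
    keep_prob = 1.0 - drop_prob
    total = 0.0
    for x, w in zip(features, weights):
        if rng.random() < keep_prob:
            # Rescale kept inputs so the expected output equals the no-dropout output.
            total += (x / keep_prob) * w
    return total


def mc_dropout_estimate(features, weights, drop_prob=0.2, passes=500, seed=0):
    """Mean and standard deviation over repeated dropout-enabled passes."""
    rng = random.Random(seed)
    samples = [
        predict_with_dropout(features, weights, drop_prob, rng)
        for _ in range(passes)
    ]
    return statistics.mean(samples), statistics.stdev(samples)


# Hypothetical supplier features and weights (assumptions for illustration only).
features = [1.2, 0.4, 2.0]
weights = [0.30, 0.10, 0.05]

mean, std = mc_dropout_estimate(features, weights)
print(f"intensity = {mean:.3f} +/- {std:.3f} kgCO2e/USD")
```

In a GNN the same pattern would apply to the dropout layers between message-passing steps; the standard deviation is an approximation of model uncertainty, not a calibrated confidence interval.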
Multi-tier supplier graph showing emission intensity propagation across Tier 1–4 suppliers. Node size = annual spend; node color = kgCO2e/USD intensity.
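To make the tier structure in the figure concrete, the following minimal sketch assigns each supplier a tier by breadth-first search from the focal company and sums spend × intensity per tier. The flows, intensities, and function names are hypothetical, and the per-tier totals are gross figures: a real attribution would also allocate a shared supplier's emissions across its customers.

```python
from collections import defaultdict, deque

# Hypothetical procurement flows: (buyer, supplier, annual spend in USD).
flows = [
    ("focal", "s1", 500_000),
    ("focal", "s2", 300_000),
    ("s1", "s3", 200_000),
    ("s2", "s3", 100_000),
    ("s3", "s4", 50_000),
]
# Hypothetical emission intensities in kgCO2e per USD of spend.
intensity = {"s1": 0.4, "s2": 0.9, "s3": 1.5, "s4": 2.0}


def supplier_tiers(flows, root):
    """Breadth-first search from the focal company; tier = shortest hop count."""
    suppliers_of = defaultdict(list)
    for buyer, supplier, _ in flows:
        suppliers_of[buyer].append(supplier)
    tiers = {root: 0}
    queue = deque([root])
    while queue:
        buyer = queue.popleft()
        for supplier in suppliers_of[buyer]:
            if supplier not in tiers:
                tiers[supplier] = tiers[buyer] + 1
                queue.append(supplier)
    return tiers


def emissions_by_tier(flows, intensity, root):
    """Sum spend x supplier intensity (kgCO2e) over flows, grouped by supplier tier."""
    tiers = supplier_tiers(flows, root)
    totals = defaultdict(float)
    for _, supplier, spend in flows:
        if supplier in tiers:
            totals[tiers[supplier]] += spend * intensity[supplier]
    return dict(totals)


tier_totals = emissions_by_tier(flows, intensity, "focal")
hotspot_tier = max(tier_totals, key=tier_totals.get)
for tier in sorted(tier_totals):
    print(f"Tier {tier}: {tier_totals[tier] / 1000:.0f} tCO2e")
print(f"Highest-emitting tier: {hotspot_tier}")
```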
Scope 3 GNN Mapper: System Architecture

1. Data sources
   - Raw supply chain data: CSV / ERP / REST API
   - Emission factor databases: EXIOBASE / ECOINVENT
   - IoT sensors: MQTT / REST
2. Graph Construction Layer (`SupplyChainGraphBuilder`)
   - Node: Company/Facility
   - Edge: Procurement Flow
   - Output: `HeteroData` (PyG)
3. GNN models, fed in parallel from the constructed graph
   - GAT Mapper: 8 heads × 4 layers
   - GraphSAGE (inductive): multi-hop aggregation
   - Bayesian GNN: uncertainty
4. Emission Attribution
   - Scope 3 Cat 1–15
   - Tier attribution
   - Confidence intervals
5. Outputs
   - Hotspot Detection & Ranking
   - Scenario Analysis: supplier switching, decarbonization
   - Dashboard: Streamlit + FastAPI
Scope 3 emissions represent, on average, 70–90% of a company's total GHG footprint, yet remain the most poorly measured category in enterprise sustainability reporting.
| Challenge | Traditional Approach | GNN Solution |
|---|---|---|
| Data Gaps | Spend-based estimates (±40% error) | Graph imputation from neighboring nodes |
| Multi-Tier Blindness | Tier 1 coverage only | N-tier propagation through graph |
| Static Data | Annual surveys (12-month lag) | Real-time IoT + procurement feeds |
| Attribution Errors | Manual category mapping | Automated GHG Protocol classification |
| Uncertainty | Point estimates only | Full posterior distributions per supplier |
| Scalability | 50–100 suppliers max | 100,000+ nodes via mini-batch training |

> "A supplier network is a graph. Emission factors flow through procurement edges. GNNs were built to reason over exactly this structure."
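The "graph imputation from neighboring nodes" entry in the table can be illustrated with a non-learned baseline: estimate a supplier's missing intensity as the spend-weighted mean of the intensities its neighbors report. This is a sketch under stated assumptions, not the framework's method (which uses learned GNN layers); `impute_intensity` and all values below are hypothetical.

```python
def impute_intensity(node, neighbors, known_intensity):
    """Spend-weighted mean of the known intensities of a node's neighbors.

    neighbors: dict mapping node -> list of (neighbor, spend_usd) pairs.
    Returns None when no neighbor has a known intensity.
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for neighbor, spend in neighbors.get(node, []):
        if neighbor in known_intensity:
            weighted_sum += spend * known_intensity[neighbor]
            total_weight += spend
    if total_weight == 0.0:
        return None
    return weighted_sum / total_weight


# Hypothetical neighborhood: s3 has no reported intensity and is skipped.
neighbors = {
    "s_unknown": [("s1", 100_000), ("s2", 300_000), ("s3", 50_000)],
}
known_intensity = {"s1": 0.4, "s2": 0.8}

estimate = impute_intensity("s_unknown", neighbors, known_intensity)
print(f"imputed intensity: {estimate:.2f} kgCO2e/USD")
```

A trained GNN replaces the fixed spend weights with learned, feature-dependent weights, but the underlying idea of borrowing information from connected suppliers is the same.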
Phase 1: Graph Construction

Raw supplier CSV and procurement flow data is ingested, normalized, and assembled into a PyTorch Geometric `HeteroData` object. Each supplier becomes a node with a 96-dimensional feature vector encoding industry codes, country of origin, revenue band, and matched emission factor database entries. Each procurement relationship becomes a directed edge with spend, volume, and commodity metadata.

Phase 2: GNN-Based Emission Attribution

The `GATEmissionMapper` runs 4 attention layers with 8 heads each, progressively aggregating neighborhood information to estimate per-supplier emission intensities for all 15 GHG Protocol Scope 3 categories. Bayesian dropout layers provide posterior uncertainty estimates without requiring full Bayesian inference.

Phase 3: Hotspot Detection and Scenario Analysis

An R-Value scoring algorithm ranks suppliers by reduction potential per unit effort. The Scenario Engine models supplier switching, technology adoption, and procurement changes to generate science-aligned decarbonization roadmaps with quantified abatement potential.
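For Phase 2, the attention-based aggregation can be written out in plain Python to show what a single head computes. The sketch below follows the standard GAT formulation (a shared linear map, LeakyReLU-scored pairs, softmax over the neighborhood, weighted sum) but omits multi-head concatenation and the output nonlinearity. It is not the `GATEmissionMapper` implementation; `gat_layer`, the weights, and the features are illustrative assumptions.

```python
import math


def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x


def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def gat_layer(features, neighbors, weight, attn):
    """Single-head GAT-style update.

    features:  dict node -> feature list
    neighbors: dict node -> list of neighbor nodes (include the node itself for a self-loop)
    weight:    shared linear map W, as a list of rows
    attn:      attention vector a, of length 2 * output dimension
    """
    projected = {node: matvec(weight, h) for node, h in features.items()}
    updated = {}
    for i, nbrs in neighbors.items():
        # Unnormalized scores e_ij = LeakyReLU(a . [W h_i || W h_j]).
        scores = [
            leaky_relu(sum(a * z for a, z in zip(attn, projected[i] + projected[j])))
            for j in nbrs
        ]
        # Softmax over the neighborhood, shifted by the max for numerical stability.
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        norm = sum(exps)
        alphas = [e / norm for e in exps]
        dim = len(projected[i])
        updated[i] = [
            sum(alpha * projected[j][d] for alpha, j in zip(alphas, nbrs))
            for d in range(dim)
        ]
    return updated


# Hypothetical two-dimensional features for a buyer and two suppliers.
features = {"buyer": [1.0, 0.0], "s1": [0.0, 1.0], "s2": [1.0, 1.0]}
neighbors = {"buyer": ["buyer", "s1", "s2"]}
weight = [[1.0, 0.0], [0.0, 1.0]]   # identity map, for readability
attn = [0.0, 0.0, 1.0, 1.0]         # scores depend only on the neighbor's features

out = gat_layer(features, neighbors, weight, attn)
print([round(v, 3) for v in out["buyer"]])
```

With this attention vector, `s2` receives the largest weight because its projected features sum to the highest score; setting `attn` to all zeros reduces the layer to a plain neighborhood mean.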
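For Phase 3, the R-Value formula itself is not specified here, so the following sketch substitutes a simple stand-in: annual abatement from a supplier switch divided by a one-off switching cost. `SwitchOption`, `rank_options`, and every number are hypothetical and only illustrate ranking by reduction potential per unit effort.

```python
from dataclasses import dataclass


@dataclass
class SwitchOption:
    supplier: str
    spend_usd: float              # annual spend moved to the alternative
    current_intensity: float      # kgCO2e per USD
    alternative_intensity: float  # kgCO2e per USD
    effort_cost_usd: float        # hypothetical one-off switching cost, must be > 0

    @property
    def abatement_tco2e(self) -> float:
        """Annual abatement in tonnes CO2e if the spend is moved."""
        delta = self.current_intensity - self.alternative_intensity
        return self.spend_usd * delta / 1000.0

    @property
    def score(self) -> float:
        """Abatement per unit effort (tCO2e per USD of switching cost)."""
        return self.abatement_tco2e / self.effort_cost_usd


def rank_options(options):
    """Sort options from highest to lowest abatement per unit effort."""
    return sorted(options, key=lambda option: option.score, reverse=True)


options = [
    SwitchOption("steel_a", 2_000_000, 1.8, 1.2, 150_000),
    SwitchOption("plastics_b", 500_000, 0.9, 0.5, 20_000),
    SwitchOption("freight_c", 800_000, 0.6, 0.55, 10_000),
]

for option in rank_options(options):
    print(f"{option.supplier}: {option.abatement_tco2e:.0f} tCO2e/yr, score {option.score:.4f}")
```

Ranking by the ratio rather than by absolute abatement puts the cheap plastics switch ahead of the larger but costlier steel switch, which is the trade-off a per-unit-effort score is meant to surface.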
| Requirement | Version |
|---|---|
| Python | 3.9+ |
| CUDA (optional) | 11.8+ |
| RAM | 16 GB minimum (32 GB recommended) |
| Disk Space | 5 GB (emission factor databases) |

Install from source:

    # Clone the repository
    git clone https://github.com/virbahu/scope3-gnn-mapper.git
    cd scope3-gnn-mapper

    # Create virtual environment
    python -m venv .venv
    source .venv/bin/activate      # Linux/Mac
    # .venv\Scripts\activate       # Windows

    # Install with all dependencies
    pip install -e ".[dev]"

    # Download emission factor databases
    python scripts/download_emission_factors.py --databases exiobase3,ecoinvent38

Quick start:

    from scope3_gnn.data.graph_builder import SupplyChainGraphBuilder
    from scope3_gnn.models.gat_mapper import GATEmissionMapper
    from scope3_gnn.training.trainer import Scope3Trainer
    from scope3_gnn.analysis.hotspot import HotspotDetector

    # 1. Build supply chain graph
    builder = SupplyChainGraphBuilder(emission_factor_db="exiobase3")
    graph = builder.from_csv(
        suppliers_path="data/raw/suppliers.csv",
        flows_path="data/raw/flows.csv",
    )
    print(f"Graph: {graph.num_nodes} nodes, {graph.num_edges} edges")

    # 2. Train GNN model
    model = GATEmissionMapper(
        in_channels=96,
        hidden_channels=256,
        num_layers=4,
        heads=8,
    )
    trainer = Scope3Trainer(model=model, graph=graph, epochs=200)
    trainer.fit()

    # 3. Attribution and hotspot analysis
    results = model.attribute_emissions(graph)
    detector = HotspotDetector(top_k=20)
    hotspots = detector.detect(results, graph)
    hotspots.plot_sankey()

Sample output:

    SCOPE 3 EMISSION ATTRIBUTION REPORT - 2026
    ─────────────────────────────────────────────────────────────
    Category                    tCO2e/year   % Total   Confidence
    ─────────────────────────────────────────────────────────────
    Cat 1: Purchased Goods          45,823     34.2%        ±6.1%
    Cat 4: Upstream Transport       12,104      9.0%        ±8.3%
    Cat 11: Use of Sold Prods       38,912     29.0%       ±11.2%
    Cat 12: End-of-Life             14,332     10.7%        ±9.4%
    Other Categories (11)           23,057     17.2%       ±14.1%
    ─────────────────────────────────────────────────────────────
    TOTAL SCOPE 3                  134,228    100.0%        ±7.8%
    ─────────────────────────────────────────────────────────────
    Graph Coverage: 847 direct suppliers | Model Accuracy: 91.3%

Core dependencies, as declared in `pyproject.toml`:

    [tool.poetry.dependencies]
    python = "^3.9"
    torch = "^2.0"
    torch-geometric = "^2.3"
    torch-scatter = "*"
    torch-sparse = "*"
    networkx = "^3.1"
    neo4j = "^5.0"
    pandas = "^2.0"
    numpy = "^1.24"
    scikit-learn = "^1.3"
    fastapi = "^0.104"
    streamlit = "^1.28"
    plotly = "^5.17"
    pydantic = "^2.0"

| Database | Coverage | Frequency |
|---|---|---|
| EXIOBASE 3 | 44 countries, 163 sectors | Annual |
| ecoinvent 3.8 | 16,000+ processes | Annual |
| GHG Protocol | 300+ factors | Quarterly |
| EPA EEIO | US industry IO tables | Annual |
| GLEC Framework | Transport factors | Bi-annual |
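As a point of comparison for how such factor databases are typically consumed, here is a minimal sketch of a spend-based baseline: look up a factor by country and sector, fall back to a sector average when no country-specific entry exists, and multiply by spend. The factor values are placeholders invented for illustration, not figures from EXIOBASE, ecoinvent, or any other database, and the function names are hypothetical.

```python
# Illustrative factors in kgCO2e per USD, keyed by (country, sector).
# Placeholder numbers only; they are NOT taken from any real database.
FACTORS = {
    ("DE", "steel"): 1.1,
    ("CN", "steel"): 2.3,
    ("DE", "plastics"): 0.7,
}
SECTOR_FALLBACK = {"steel": 1.8, "plastics": 0.8}


def lookup_factor(country, sector):
    """Prefer a country-specific factor, else fall back to a sector average."""
    if (country, sector) in FACTORS:
        return FACTORS[(country, sector)]
    if sector in SECTOR_FALLBACK:
        return SECTOR_FALLBACK[sector]
    raise KeyError(f"no emission factor for sector {sector!r}")


def spend_based_emissions(purchases):
    """Sum spend x factor over (country, sector, spend_usd) rows; returns tonnes CO2e."""
    total_kg = sum(
        spend * lookup_factor(country, sector)
        for country, sector, spend in purchases
    )
    return total_kg / 1000.0


purchases = [
    ("DE", "steel", 100_000),
    ("IN", "steel", 50_000),     # no country-specific entry, uses the sector fallback
    ("DE", "plastics", 20_000),
]
print(f"{spend_based_emissions(purchases):.1f} tCO2e")
```

The fallback step is where much of the error in spend-based estimates enters, since a sector average can differ substantially from a given supplier's actual intensity.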
Virbahu Jain, Founder & CEO, Quantisage
Building the AI Operating System for Scope 3 emissions management and supply chain decarbonization.
| | |
|---|---|
| Education | MBA, Kellogg School of Management, Northwestern University |
| Experience | 20+ years across manufacturing, life sciences, energy & public sector |
| Scope | Supply chain operations on five continents |
| Research | Peer-reviewed publications on AI in sustainable supply chains |
| Patents | IoT and AI solutions for manufacturing and logistics |
MIT License. See LICENSE for details.
