GitHub - CityMind-Lab/GeoHG: ACM SIGSPATIAL 2025-Space-aware Socioeconomic Indicator Inference with Heterogeneous Graphs

GeoHG: Space-aware Socioeconomic Indicator Inference with Heterogeneous Graphs

A modular, pip-installable Python toolkit that uses heterogeneous graph neural networks to infer socioeconomic indicators (Carbon, GDP, Population, PM2.5, Night Light) from spatial data. Published at ACM SIGSPATIAL 2025.

TL;DR — Turn your geographic raster/vector data into a heterogeneous graph, then let message-passing GNNs learn multi-scale spatial relationships that traditional geostatistics cannot capture. Three lines of Python to go from scatter points to grid predictions.

from geohg import GeoHGInterpolator
interp, result = GeoHGInterpolator.from_dataframe(df, coord_columns=["lon","lat"], target_column="value", resolution=0.01)
# result["predictions"] -> (n_lat, n_lon) grid array

@inproceedings{zou2024space,
  title={Space-aware Socioeconomic Indicator Inference with Heterogeneous Graphs},
  author={Zou, Xingchen and Huang, Jiani and Hao, Xixuan and Yang, Yuhao and Wen, Haomin and Yan, Yibo and Huang, Chao and Chen, Chao and Liang, Yuxuan},
  booktitle={The 33rd ACM International Conference on Advances in Geographic Information Systems},
  year={2025}
}

Why Graphs for Geography?

"Everything is related to everything else, but near things are more related than distant things." — Tobler's First Law of Geography

Traditional spatial methods (IDW, Kriging, spatial regression) operationalize Tobler's Law through distance-based weights — the closer two locations are, the more influence they have on each other. This works well but misses a crucial insight:

Spatial relationships are richer than just proximity.

Two forests 50 km apart may behave more similarly than a forest and a factory next door
Commercial zones across a city share economic patterns regardless of distance
POI distributions (restaurants, offices, parks) reveal functional urban structure that pure distance ignores

GeoHG captures all these relationships in a single heterogeneous graph:

           ┌─────────────────────────────────────────────────────┐
           │         The GeoHG Heterogeneous Graph               │
           │                                                     │
           │    [Forest]         [Urban]         [Water]         │
           │    Entity           Entity          Entity          │
           │   ╱  │  ╲          ╱  │  ╲         ╱    ╲          │
           │  ╱   │   ╲        ╱   │   ╲       ╱      ╲         │
           │ a₁   a₅   a₈    a₂   a₄   a₇   a₃      a₆        │
           │  ╲   │   ╱ ╲    ╱    │   ╱                         │
           │   ╲  │  ╱   ╲  ╱     │  ╱    ← spatial adjacency   │
           │    ╲ │ ╱     ╲╱      │ ╱       (8-neighbor grid)   │
           │     a₉──────a₁₀─────a₁₁                           │
           │                                                     │
           │  ▪ area nodes   = grid cells (land cover features) │
           │  ▪ entity nodes = land cover categories (hypernode) │
           │  ▪ poi nodes    = POI categories (hypernode)        │
           │  ▪ edges        = near / locate / rev_locate        │
           └─────────────────────────────────────────────────────┘

Message passing on this graph lets information flow through both spatial proximity AND semantic similarity — something traditional geostatistics cannot do. Each GNN layer aggregates features from neighbors, progressively building richer representations that encode:

Local context — what does this cell's neighborhood look like? (spatial adjacency)
Global patterns — how do all areas of the same land cover type behave? (entity hypernodes)
Functional similarity — what urban functions are present? (POI hypernodes)

GeoHG vs. Traditional Spatial Methods

Method	Spatial Relationships	Feature Utilization	Scalability	Multi-task
IDW	Distance-only	None	High	No
Kriging	Variogram (distance)	Limited (co-kriging)	Medium	No
Spatial Regression (GWR)	Distance-weighted	Linear	Medium	No
Random Forest	None (tabular)	All features	High	No
GeoHG (Ours)	Multi-relational graph	All features + structure	High	Yes

Highlights

Spatial Interpolation	Flexible Data Input	Heterogeneous GNN	SSL Pretraining
Kriging-style interpolation via `GeoHGInterpolator` — from scatter points to grid predictions in 3 lines	Bring your own CSV/DataFrame, ESA WorldCover TIF, or use built-in sample data for 4 cities x 5 indicators	Dynamic node type inference — entity/POI counts detected from data, not hardcoded	Contrastive pretraining with graph-structure and feature-similarity neighbors

Benchmark Results

Results on built-in Guangzhou dataset (8,540 grid cells, 70% masked as test set, seed=0):

Task	Metric	Train	Validation	Note
Carbon	R²	0.865	0.860	2000 epochs, early-stop at ~170
GDP	R²	—	—	Run `geohg train --task GDP`
Population	R²	—	—	Run `geohg train --task Population`
PM2.5	R²	—	—	Run `geohg train --task PM25`
Night Light	R²	—	—	Run `geohg train --task Light`

Reproduce with: geohg train --config configs/examples/guangzhou_carbon.yaml

All 4 cities (GZ/BJ/SH/SZ) x 5 indicators are available. Run your own benchmarks!
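For reference, the three metrics accepted by the training config (r2, rmse, mae) have standard definitions. The sketch below is plain Python written for illustration, not the code in geohg/training/evaluator.py, and the sample numbers are made up.

```python
import math

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

# Made-up labels and predictions for four grid cells
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.1, 1.9, 3.2, 3.8]

print(f"R2={r2_score(y_true, y_pred):.3f}  "
      f"RMSE={rmse(y_true, y_pred):.3f}  MAE={mae(y_true, y_pred):.3f}")
```

R² is scale-free, which makes it convenient for comparing tasks with different units; RMSE and MAE are reported in the units of the target.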

1. Installation

conda create -n geohg python=3.10 -y
conda activate geohg

git clone <your-repo-url>
cd GeoHG

# Standard install
pip install -e .

# With TIF/rasterio support (for building graphs from ESA WorldCover)
pip install -e ".[geo]"

# With development tools
pip install -e ".[dev]"

Verify installation:

geohg --version
python -c "from geohg import GeoHGPipeline; print('OK')"

2. Quick Start

30-Second Demo

from geohg import GeoHGPipeline

# One line: load built-in Guangzhou data, train, evaluate
results = GeoHGPipeline.quick_start(city="GZ", task="Carbon")
print(f"Test R²: {results['r2']:.4f}")

Spatial Interpolation (3 lines)

from geohg import GeoHGInterpolator

interp, result = GeoHGInterpolator.from_dataframe(
    df, coord_columns=["lon", "lat"],
    target_column="value", resolution=0.01
)
# result["predictions"] is a (n_lat, n_lon) grid array

See Section 3 for details and notebooks/quickstart_interpolation.ipynb for a full tutorial.

CLI (recommended for experiments)

# Train a model
geohg train --config configs/examples/guangzhou_carbon.yaml

# End-to-end: build graph + train + evaluate
geohg run --config configs/examples/guangzhou_carbon.yaml

Python API

from geohg import GeoHGPipeline

pipeline = GeoHGPipeline.from_yaml("configs/examples/guangzhou_carbon.yaml")
results = pipeline.run()

3. Spatial Interpolation

GeoHG can perform spatial interpolation similar to Kriging, but using heterogeneous graph neural networks on discrete grids.

Key difference from Kriging: GeoHG divides space into regular grid cells (user-specified resolution, e.g., 0.01°, roughly 1 km), constructs a heterogeneous graph over the grid, and predicts one value per cell rather than interpolating over continuous space.

Scatter points (N, 2)          GeoHG Pipeline
--------------------------     ---------------------------
coords + features + values     1. Grid discretization (resolution)
      |                        2. 8-neighbor adjacency + entity hyper-nodes
      +-->  UserDataSource --> 3. HeteroGraphBuilder -> HeteroData
                               4. Train GeoHGModel (observed -> train/val)
                               5. Predict all grid cells -> (n_lat, n_lon) array

Python API

from geohg import GeoHGInterpolator
import numpy as np

# lons, lats, f1, f2, f3, values: your own 1-D arrays of length N
coords = np.column_stack([lons, lats])   # (N, 2)
features = np.column_stack([f1, f2, f3]) # (N, F)

# One-liner from DataFrame
interp, result = GeoHGInterpolator.from_dataframe(
    df, coord_columns=["lon", "lat"],
    target_column="value", resolution=0.01
)

# Or step by step
interp = GeoHGInterpolator(resolution=0.01, epochs=1000)
interp.fit(coords, features, values)
result = interp.predict()  # {"predictions": (n_lat, n_lon), "lons": ..., "lats": ...}

CLI

geohg interpolate \
  --data my_data.csv \
  --coord-columns lon,lat \
  --target-column target \
  --resolution 0.01 \
  --output predictions.csv \
  --epochs 1000
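To try the command without real data, a synthetic scatter file with the same column names can be generated first. This is a minimal sketch using only the standard library; the helper name `write_scatter_csv`, the smooth target surface, and the point count are all illustrative assumptions, and the bounding box is borrowed from the Guangzhou build-graph example.

```python
import csv
import math
import random

def write_scatter_csv(path, n_points=200, seed=0):
    """Write n_points random (lon, lat, target) rows to a CSV file and return them."""
    rng = random.Random(seed)  # fixed seed so the file is reproducible
    rows = []
    for _ in range(n_points):
        lon = rng.uniform(113.09, 113.69)
        lat = rng.uniform(22.40, 23.42)
        # Hypothetical target: a smooth surface plus a little Gaussian noise
        target = math.sin(10 * lon) + math.cos(10 * lat) + rng.gauss(0, 0.05)
        rows.append({"lon": round(lon, 5), "lat": round(lat, 5), "target": round(target, 4)})
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["lon", "lat", "target"])
        writer.writeheader()
        writer.writerows(rows)
    return rows

rows = write_scatter_csv("my_data.csv")
print(f"wrote {len(rows)} rows to my_data.csv")
```

The resulting file has the `lon`, `lat`, and `target` columns that the `--coord-columns` and `--target-column` flags above refer to.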

Notebook Tutorial

See notebooks/quickstart_interpolation.ipynb for a complete tutorial including:

Synthetic data interpolation
Real-world case: Guangzhou carbon emission interpolation with 8,540 grid cells

4. Core Concepts: From Geographic Space to Graph

This section explains the key ideas behind GeoHG for GIS practitioners.

Step 1: Grid Discretization

Like rasterization in GIS, GeoHG divides the study area into regular grid cells. Each cell becomes an area node in the graph, with land cover ratios as its feature vector.

Continuous geographic space          Discrete grid (area nodes)
┌────────────────────┐               ┌───┬───┬───┬───┐
│  ~  forest  ~      │               │ a₀│ a₁│ a₂│ a₃│  Each cell:
│    ┌──urban──┐     │   ────────>   ├───┼───┼───┼───┤  - land cover ratios
│    │ ■■■■■■■ │     │   grid at     │ a₄│ a₅│ a₆│ a₇│  - position encoding
│    └─────────┘     │   0.01° res   ├───┼───┼───┼───┤  - (optional POI counts)
│  ~ water ~~~~~     │               │ a₈│ a₉│a₁₀│a₁₁│
└────────────────────┘               └───┴───┴───┴───┘
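The discretization step can be pictured as integer division of coordinates by the resolution. The sketch below is an illustration, not the GeoHG implementation: `cell_index` and `aggregate_to_grid` are hypothetical names, and averaging multiple observations that fall in the same cell is an assumption.

```python
import math
from collections import defaultdict

def cell_index(lon, lat, lon_min, lat_min, resolution):
    """Map a point to the (row, col) of the regular grid cell that contains it."""
    col = math.floor((lon - lon_min) / resolution)
    row = math.floor((lat - lat_min) / resolution)
    return row, col

def aggregate_to_grid(points, lon_min, lat_min, resolution):
    """Average the values of all points that fall into the same cell (assumed rule)."""
    buckets = defaultdict(list)
    for lon, lat, value in points:
        buckets[cell_index(lon, lat, lon_min, lat_min, resolution)].append(value)
    return {cell: sum(vals) / len(vals) for cell, vals in buckets.items()}

# Three made-up observations; the first two share a cell at 0.01 degree resolution
points = [
    (113.125, 22.437, 10.0),
    (113.128, 22.431, 20.0),
    (113.095, 22.405, 5.0),
]
grid_values = aggregate_to_grid(points, lon_min=113.09, lat_min=22.40, resolution=0.01)
print(grid_values)
```

Cells that receive no observation have no entry in the result; those are the cells a trained model would later have to predict.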

Step 2: Spatial Adjacency (Tobler's Law)

Each grid cell is connected to up to 8 neighbors (queen contiguity; cells on the border have fewer), forming (area, near, area) edges. This encodes Tobler's First Law: near things are more related than distant things.

┌───┬───┬───┐
│ ↖ │ ↑ │ ↗ │    8-neighbor connectivity
├───┼───┼───┤    = Queen contiguity in GIS
│ ← │ ● │ → │    = (area, near, area) edges
├───┼───┼───┤
│ ↙ │ ↓ │ ↘ │
└───┴───┴───┘
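Queen contiguity on a full rectangular grid is simple to enumerate. The following sketch assumes row-major cell numbering as in the Step 1 diagram (a₀ to a₁₁ on a 3 x 4 grid); `queen_adjacency` is a hypothetical helper, not the function in geohg/data/builders/adjacency.py.

```python
def queen_adjacency(n_rows, n_cols):
    """Return directed (src, dst) edges between cells that share a side or a corner."""
    edges = []
    for r in range(n_rows):
        for c in range(n_cols):
            src = r * n_cols + c  # row-major cell id
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    if dr == 0 and dc == 0:
                        continue  # a cell is not its own neighbor
                    nr, nc = r + dr, c + dc
                    if 0 <= nr < n_rows and 0 <= nc < n_cols:
                        edges.append((src, nr * n_cols + nc))
    return edges

edges = queen_adjacency(3, 4)
neighbors_of_a5 = sorted(dst for src, dst in edges if src == 5)
print(len(edges), neighbors_of_a5)
```

Interior cells such as a₅ get all 8 neighbors, while corner cells get 3 and other border cells get 5.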

Step 3: Entity Hypernodes (Beyond Distance)

Here is where GeoHG goes beyond traditional spatial methods. Each land cover category (forest, urban, water, ...) becomes an entity hypernode that connects to all grid cells containing that category. This creates shortcuts in the graph:

                [Forest Entity]
               ╱       │       ╲
         a₀(70%)   a₅(40%)   a₁₁(90%)    ← cells with forest cover
              \        |        /
               ╲       │       ╱
                [Urban Entity]
               ╱       │       ╲
         a₃(80%)   a₆(55%)   a₇(60%)     ← cells with urban cover

Why this matters: A forest cell in the north and a forest cell in the south — even if far apart — can exchange information through the shared Forest entity node. This captures landscape-level patterns that distance-based methods miss entirely.
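One way to picture the entity edges is as a table of (entity, area, proportion) triples, matching the columns of entity_area.csv described in the Data Format section. The sketch below is illustrative: `entity_edges` is a hypothetical helper, the cover ratios are made up, and treating the threshold as a strict "greater than" is an assumption.

```python
def entity_edges(cover_ratios, thresh=0.0):
    """Build (entity, area_id, proportion) triples for every land cover share above thresh."""
    edges = []
    for area_id, ratios in cover_ratios.items():
        for entity, share in ratios.items():
            if share > thresh:  # assumed rule: keep only strictly positive shares
                edges.append((entity, area_id, share))
    return edges

# Made-up land cover ratios for three cells that are far apart on the grid
cover_ratios = {
    "a0": {"forest": 0.7, "urban": 0.0, "water": 0.3},
    "a3": {"forest": 0.0, "urban": 0.8, "water": 0.2},
    "a11": {"forest": 0.9, "urban": 0.1, "water": 0.0},
}
edges = entity_edges(cover_ratios)
forest_cells = sorted(area for entity, area, _ in edges if entity == "forest")
print(forest_cells)
```

Cells a0 and a11 are not grid neighbors, yet both attach to the forest entity, so they sit two hops apart in the graph regardless of their geographic distance.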

Step 4: POI Hypernodes (Urban Function)

Similarly, POI categories (restaurants, offices, parks, hospitals, ...) become POI hypernodes. Each grid cell is connected to the POI nodes for the categories present in it, capturing functional urban structure.

Step 5: Heterogeneous Message Passing

The complete graph has 3 node types and 5 edge types:

Node Type	Count (Guangzhou)	Features
`area`	8,540	Land cover ratios + position encoding
`entity`	9	Identity (one-hot)
`poi`	14	Identity (one-hot)

Edge Type	Count (Guangzhou)	Meaning
`(area, near, area)`	67,172	Spatial adjacency
`(entity, locate, area)`	24,902	Entity covers area
`(area, rev_locate, entity)`	24,902	Reverse of above
`(poi, locate, area)`	9,957	POI exists in area
`(area, rev_locate, poi)`	9,957	Reverse of above

A GNN with to_hetero() conversion learns separate message functions for each edge type, then aggregates them — like having specialized spatial analysis for each type of geographic relationship, all learned end-to-end.
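The idea of "one message function per edge type, then aggregate" can be shown without any deep learning framework. This toy sketch uses scalar node features and a single scalar weight per edge type in place of learned weight matrices, and it omits self-connections and nonlinearities. Summing across edge types mirrors the default `aggr="sum"` of PyG's `to_hetero()`; everything else (function names, numbers) is made up for illustration.

```python
from collections import defaultdict

def mean_aggregate(edges, src_feats, weight):
    """One relation: average the source features arriving at each destination, then scale."""
    incoming = defaultdict(list)
    for src, dst in edges:
        incoming[dst].append(src_feats[src])
    return {dst: weight * sum(vals) / len(vals) for dst, vals in incoming.items()}

def hetero_layer(feats, relations, weights):
    """Apply one message function per edge type, then sum the results per destination node."""
    out = {ntype: defaultdict(float) for ntype in feats}
    for edge_type, edges in relations.items():
        src_type, _, dst_type = edge_type
        messages = mean_aggregate(edges, feats[src_type], weights[edge_type])
        for dst, value in messages.items():
            out[dst_type][dst] += value
    return {ntype: dict(vals) for ntype, vals in out.items()}

feats = {"area": {0: 1.0, 1: 3.0, 2: 5.0}, "entity": {0: 10.0}}
relations = {
    ("area", "near", "area"): [(0, 1), (1, 0), (1, 2), (2, 1)],
    ("entity", "locate", "area"): [(0, 0), (0, 2)],
    ("area", "rev_locate", "entity"): [(0, 0), (2, 0)],
}
weights = {  # stand-ins for the per-edge-type parameters a GNN would learn
    ("area", "near", "area"): 0.5,
    ("entity", "locate", "area"): 0.1,
    ("area", "rev_locate", "entity"): 1.0,
}
new_feats = hetero_layer(feats, relations, weights)
print(new_feats)
```

Area nodes 0 and 2 receive messages over both `near` and `locate`, area node 1 only over `near`, and the entity node collects from its member cells over `rev_locate`.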

5. Application Scenarios

GeoHG is designed for any task where you need to infer a spatially distributed socioeconomic indicator from land cover and/or POI data:

Scenario	Target Variable	Input Features	Example
Carbon Emission Mapping	CO₂ emissions per grid cell	Land cover ratios, POI density	Urban carbon inventory
Economic Activity Estimation	GDP per grid cell	Land cover, commercial POI	Regional economic assessment
Population Distribution	Population density	Built-up area ratio, residential POI	Census disaggregation
Air Quality Prediction	PM2.5 concentration	Vegetation ratio, industrial land	Environmental monitoring
Night Light Estimation	Luminosity index	Urban land ratio, commercial POI	Urbanization tracking
Custom Indicator	Your own target	Your CSV/DataFrame features	`GeoHGInterpolator.from_dataframe()`

When to use GeoHG over traditional methods:

You have multi-dimensional features (not just coordinates)
Your study area has heterogeneous land cover or diverse urban functions
You want to leverage structural similarity between distant but functionally similar areas
You need predictions at grid-cell resolution across the entire study area

6. Features

Spatial interpolation — Kriging-style interpolation via GeoHGInterpolator (scatter points -> grid predictions)
End-to-end pipeline — from raw data to trained model in one command or a few lines of Python
Modular architecture — configurable GNN encoder, MLP head, and training components
Dynamic graph construction — entity/POI node types inferred from data, not hardcoded
Multiple data sources — custom CSV/DataFrame, ESA WorldCover TIF, or built-in sample data
POI optional — works with or without POI data (hypernode: entity or mono)
Self-supervised pretraining — contrastive GNN pretraining with neighbor sampling
YAML configuration — three-level priority: defaults < user config < CLI overrides
CPU/GPU support — automatic fallback to CPU when CUDA is unavailable

7. Project Structure

GeoHG/
├── pyproject.toml                    # Package definition & dependencies
├── configs/
│   ├── default.yaml                  # Global default configuration
│   └── examples/
│       ├── guangzhou_carbon.yaml     # Guangzhou carbon emission example
│       ├── no_poi.yaml              # Training without POI data
│       └── ssl_pretrain.yaml        # Self-supervised pretraining
├── geohg/                            # Main package
│   ├── config/                       # Dataclass config schema + YAML loader
│   ├── data/
│   │   ├── sources/                  # Data source abstractions
│   │   │   ├── base.py              # LandCoverSource ABC + GraphRawData
│   │   │   ├── esa_worldcover.py    # ESA TIF processing (optional GDAL)
│   │   │   ├── custom_landcover.py  # User-provided CSV
│   │   │   └── poi.py              # Optional POI augmentation
│   │   ├── builders/                 # Graph construction
│   │   │   ├── graph.py             # HeteroGraphBuilder -> PyG HeteroData
│   │   │   ├── adjacency.py         # 8-neighbor grid adjacency
│   │   │   ├── grid.py              # TIF grid utilities
│   │   │   └── features.py          # Pixel-to-ratio feature extraction
│   │   ├── legacy.py                # Backward-compatible loader
│   │   └── transforms.py            # TargetNormalizer + train/val/test split
│   ├── models/
│   │   ├── gnn.py                   # GNNEncoder (configurable layers/type)
│   │   ├── heads.py                 # MLPHead (configurable dims)
│   │   ├── geohg.py                 # GeoHGModel = GNN + to_hetero + Head
│   │   └── ssl.py                   # Contrastive loss + SSLNeighborDataset
│   ├── training/
│   │   ├── trainer.py               # Supervised training loop
│   │   ├── ssl_trainer.py           # SSL pretraining loop
│   │   ├── evaluator.py             # R2/RMSE/MAE metrics
│   │   └── callbacks.py             # EarlyStopping + ModelCheckpoint
│   ├── pipeline.py                   # GeoHGPipeline (end-to-end orchestration)
│   ├── interpolator.py               # GeoHGInterpolator (spatial interpolation)
│   ├── cli/                          # Click CLI commands
│   │   ├── main.py                  # geohg entry point
│   │   ├── train.py                 # geohg train
│   │   ├── pretrain.py              # geohg pretrain
│   │   ├── evaluate.py              # geohg evaluate
│   │   ├── run.py                   # geohg run (end-to-end)
│   │   ├── build.py                 # geohg build-graph
│   │   └── interpolate.py           # geohg interpolate
│   ├── visualization/                # Training & prediction plots
│   └── utils/                        # Seed, device, logging
├── data/                             # Sample data (4 cities)
│   ├── Hyper_Graph/{GZ,BJ,SH,SZ}/  # Graph & feature files
│   └── downstream_tasks/{city}/     # Task label files
├── notebooks/                        # Jupyter tutorials
│   └── quickstart_interpolation.ipynb
└── tests/                            # Test suite

8. Data Format

Custom CSV (recommended)

The easiest way to use GeoHG with your own data — provide three CSV files:

from geohg.data.sources.custom_landcover import CustomLandCoverSource

source = CustomLandCoverSource(
    feature_csv="my_features.csv",   # area_id, type1_ratio, type2_ratio, ...
    coord_csv="my_coords.csv",       # area_id, lon, lat
    adjacency_csv="my_edges.csv",    # src_id, dst_id
)
raw = source.load()

Or use GeoHGInterpolator.from_dataframe(df, ...) to feed a DataFrame directly (see Section 3).

ESA WorldCover TIF

Build graphs from GeoTIFF with automatic grid and adjacency construction (requires [geo] extra):

pip install -e ".[geo]"
geohg build-graph --tif-path ESA_WorldCover.tif --bbox 113.09 113.69 22.40 23.42 --output-dir data/my_city

Built-in sample data

The repository ships with sample data for 4 cities (GZ/BJ/SH/SZ) x 5 indicators under data/, all in CSV format. Use them directly with GeoHGPipeline.quick_start(city="GZ", task="Carbon").

Graph files under data/Hyper_Graph/{city}/:

File	Columns	Description
`adjacency.csv`	`src_id, dst_id`	Area adjacency edges
`entity_area.csv`	`entity_id, area_id, proportion`	Entity-locate-area relations
`poi_area.csv`	`poi_id, area_id, proportion`	POI-locate-area relations
`pos_encode.csv`	`area_id, x, y`	Grid position encodings
`TIF_feature.csv`	`File, Coordinates, Area, Value_*_Ratio`	Land cover feature ratios
`POI_feature.csv`	`TIF, POI_0, ..., POI_13, Count`	POI feature ratios

Task labels under data/downstream_tasks/{city}/:

File	Columns	Description
`Carbon.csv`	`area_id, value`	Carbon emission
`GDP.csv`	`area_id, value`	GDP
`Population.csv`	`area_id, value`	Population
`PM25.csv`	`area_id, value`	PM2.5
`Light.csv`	`area_id, value`	Night light

Entity/POI type counts are automatically detected from the data files.

9. CLI Reference

All commands accept --config <path> for YAML configuration. CLI flags override config values.

Command	Description
`geohg train`	Supervised training
`geohg pretrain`	Self-supervised contrastive pretraining
`geohg evaluate --model-path <path>`	Evaluate a saved model
`geohg run`	End-to-end pipeline (build + train + evaluate)
`geohg build-graph`	Build graph from TIF (requires `[geo]` extra)
`geohg interpolate`	Spatial interpolation from CSV scatter data

Common options:

geohg train --config configs/examples/guangzhou_carbon.yaml \
  --city GZ \
  --task Carbon \
  --epochs 2000 \
  --lr 0.01 \
  --masked-ratio 0.7 \
  --metric r2 \
  --gpu 0
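The `--masked-ratio` option controls how many labels are hidden from training. The sketch below shows one plausible reading, a seeded random split into observed and masked cells; it is an assumption about the behavior, not the code in geohg/data/transforms.py, and `masked_split` is a hypothetical name.

```python
import random

def masked_split(area_ids, masked_ratio=0.7, seed=0):
    """Shuffle ids with a fixed seed and hide a masked_ratio share of their labels."""
    rng = random.Random(seed)
    ids = list(area_ids)
    rng.shuffle(ids)
    n_masked = int(round(len(ids) * masked_ratio))
    masked = sorted(ids[:n_masked])    # labels hidden during training
    observed = sorted(ids[n_masked:])  # labels available for train/validation
    return observed, masked

# Cell count of the built-in Guangzhou dataset
observed, masked = masked_split(range(8540), masked_ratio=0.7, seed=0)
print(len(observed), len(masked))
```

Fixing the seed makes the split repeatable, which matters when comparing runs that differ only in model settings.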

10. Configuration

Configuration uses a three-level priority system: configs/default.yaml < user YAML < CLI flags.

# configs/examples/guangzhou_carbon.yaml
data:
  city: GZ
  task: Carbon
  prebuilt_dir: data/Hyper_Graph
  downstream_dir: data/downstream_tasks
  pos_embedding: true
  hypernode: all            # all | entity | poi | mono
  entity_thresh: 0.0
  poi_thresh: 0.0

model:
  hidden_channels: 64
  num_gnn_layers: 3
  gnn_type: GraphConv       # GraphConv | SAGEConv
  head_dims: [32, 16]
  dropout: 0.5

training:
  epochs: 2000
  lr: 0.01
  metric: r2                # r2 | rmse | mae
  patience: 100
  masked_ratio: 0.7
  seed: 0

output:
  log_dir: outputs/logs
  model_dir: outputs/models
  plot_dir: outputs/plots

See configs/examples/ for more configuration templates.
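The three-level priority can be thought of as two successive recursive dictionary merges. The sketch below illustrates that idea with plain dictionaries; it is not the loader in geohg/config/, and the user and CLI values are hypothetical.

```python
import copy

def deep_merge(base, override):
    """Return a copy of base in which nested keys from override take precedence."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)  # recurse into sub-sections
        else:
            merged[key] = copy.deepcopy(value)
    return merged

defaults = {"training": {"epochs": 2000, "lr": 0.01, "patience": 100}}
user_yaml = {"training": {"lr": 0.005}}   # hypothetical user override
cli_flags = {"training": {"epochs": 50}}  # e.g. from --epochs 50

config = deep_merge(deep_merge(defaults, user_yaml), cli_flags)
print(config)
```

Keys that no later level mentions, such as `patience` here, keep their default value.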

11. Self-Supervised Pretraining

Contrastive pretraining of the GNN encoder using graph-structure and feature-similarity neighbors:

# Pretrain
geohg pretrain --config configs/examples/ssl_pretrain.yaml --city GZ

# Fine-tune with pretrained encoder
geohg train --config configs/examples/guangzhou_carbon.yaml \
  --pretrained-gnn outputs/models/ssl_GZ.pth \
  --freeze-gnn

Or via Python:

from geohg import GeoHGPipeline

pipeline = GeoHGPipeline.from_yaml("configs/examples/ssl_pretrain.yaml")
results = pipeline.run(skip_pretrain=False)

Why SSL? When labeled data is scarce (e.g., only a few ground-truth monitoring stations), self-supervised pretraining learns useful spatial representations from the graph structure alone. Fine-tuning on the small labeled set then often yields better results than training from scratch.
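To make "contrastive" concrete: such objectives pull an anchor's embedding toward its positives (here, graph or feature-similarity neighbors) and push it away from other nodes. The exact loss in geohg/models/ssl.py is not shown in this README; the sketch below uses InfoNCE, a common contrastive objective, as a stand-in, with made-up 2-D embeddings and an arbitrary temperature.

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def info_nce(anchor, positive, negatives, temperature=0.5):
    """InfoNCE: -log(exp(s_pos / t) / (exp(s_pos / t) + sum_k exp(s_neg_k / t)))."""
    pos = math.exp(cosine(anchor, positive) / temperature)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))

anchor = [1.0, 0.0]
neighbor = [0.9, 0.1]                 # stands for an adjacent or feature-similar cell
others = [[0.0, 1.0], [-1.0, 0.0]]    # stands for unrelated cells

loss_good = info_nce(anchor, neighbor, others)                  # neighbor used as positive
loss_bad = info_nce(anchor, others[0], [neighbor, others[1]])   # unrelated cell as positive
print(f"{loss_good:.3f} < {loss_bad:.3f}")
```

The loss is low when the chosen positive is already the most similar node and high otherwise, which is the signal that shapes the encoder during pretraining.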

12. Building Graphs from TIF

GeoHG can build heterogeneous graphs directly from ESA WorldCover GeoTIFF files:

pip install -e ".[geo]"

geohg build-graph \
  --tif-path path/to/ESA_WorldCover.tif \
  --bbox 113.09 113.69 22.40 23.42 \
  --output-dir data/my_city

The output files (feature CSV, adjacency, position encodings) can then be used with geohg train.

For custom data without GDAL, provide CSVs directly — see Data Format.

Supported land cover sources:

ESA WorldCover 10m — global land cover at 10m resolution
Custom GeoTIFF — any categorical raster with land cover classes
Custom CSV — pre-computed land cover ratios per grid cell
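For the Custom CSV route, land cover ratios have to be computed per grid cell beforehand. A minimal sketch of that pixel-to-ratio step is shown below, assuming the raster pixels of one cell are already available as a small 2-D list of class codes. The codes 10, 50, and 80 are ESA WorldCover values for tree cover, built-up, and permanent water bodies; the helper name and the pixel block are made up, and this is not the code in geohg/data/builders/features.py.

```python
from collections import Counter

def land_cover_ratios(pixel_block, classes):
    """Share of each land cover class among the pixels of one grid cell."""
    counts = Counter(code for row in pixel_block for code in row)
    total = sum(counts.values())
    return {name: counts.get(code, 0) / total for code, name in classes.items()}

# Subset of ESA WorldCover class codes used in this toy example
classes = {10: "tree_cover", 50: "built_up", 80: "water"}

# Made-up 4 x 4 block of pixels covering a single grid cell
block = [
    [10, 10, 50, 50],
    [10, 10, 50, 80],
    [10, 10, 10, 80],
    [10, 10, 10, 80],
]
ratios = land_cover_ratios(block, classes)
print(ratios)
```

Each cell's ratios form one row of the feature CSV, with one column per land cover class.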

13. FAQ

Q: How is GeoHG different from Kriging / IDW?

Kriging and IDW are distance-based interpolation methods — they predict values at unknown locations using weighted averages of nearby observations, where weights depend solely on distance (and variogram in Kriging's case).

GeoHG uses a fundamentally different approach: it constructs a heterogeneous graph that encodes multiple types of spatial relationships (proximity, land cover similarity, urban function), then uses graph neural networks to learn non-linear prediction functions. This means:

GeoHG can leverage multi-dimensional features (land cover ratios, POI distributions), not just coordinates
GeoHG captures non-distance relationships (two forests far apart share an entity hypernode)
GeoHG outputs predictions at grid-cell resolution rather than continuous points

Trade-off: GeoHG requires training data and computation, whereas Kriging has a closed-form solution once a variogram has been fitted. For small datasets with only coordinates, Kriging may be simpler and sufficient.
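For contrast, here is what "weights depend solely on distance" looks like in code. This is a minimal inverse distance weighting sketch in plain Python with made-up points; it uses planar Euclidean distance and a power of 2, both common but arbitrary choices.

```python
import math

def idw(query, points, values, power=2.0):
    """Inverse distance weighting: each observation is weighted by 1 / distance**power."""
    num = den = 0.0
    for (x, y), v in zip(points, values):
        d = math.hypot(query[0] - x, query[1] - y)
        if d == 0.0:
            return v  # query coincides with an observation
        w = 1.0 / d ** power
        num += w * v
        den += w
    return num / den

points = [(0.0, 0.0), (2.0, 0.0)]
values = [10.0, 20.0]

near_first = idw((0.5, 0.0), points, values)  # closer to the first observation
midpoint = idw((1.0, 0.0), points, values)    # equidistant from both
print(near_first, midpoint)
```

Nothing about land cover or urban function enters the estimate: two queries at the same distances from the observations always get the same value.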

Q: Do I need a GPU?

No. GeoHG automatically falls back to CPU when CUDA is unavailable. Training on CPU is slower but fully functional. For the built-in Guangzhou dataset (~8,500 nodes), CPU training completes in a few minutes.

Q: Can I use my own data without TIF files?

Yes. You have three options:

DataFrame — GeoHGInterpolator.from_dataframe(df, ...) (easiest)
CSV files — provide feature CSV + coordinate CSV + adjacency CSV via CustomLandCoverSource
TIF — use geohg build-graph to automatically extract features (requires [geo] extra)

Option 1 is the simplest: just pass a pandas DataFrame with coordinate columns, a target column, and any additional feature columns.

Q: What grid resolution should I use?

This depends on your study area and data density:

0.01° (~1 km) — good default for city-scale analysis
0.005° (~500 m) — finer resolution, requires denser observation points
0.02° (~2 km) — coarser, works with sparser data
The built-in data uses approximately 0.006° resolution for Guangzhou

Rule of thumb: ensure you have at least 3-5 observations per grid cell on average for reliable training.
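The kilometre figures above are approximations, because the east-west extent of a degree shrinks with latitude. A quick way to check the cell size for a given study area is sketched below; it assumes a spherical Earth with about 111.32 km per degree, and the function name is hypothetical.

```python
import math

KM_PER_DEGREE = 111.32  # approximate, spherical-Earth value

def cell_size_km(resolution_deg, latitude_deg):
    """Approximate (east-west, north-south) size in km of one grid cell."""
    north_south = resolution_deg * KM_PER_DEGREE
    east_west = north_south * math.cos(math.radians(latitude_deg))
    return east_west, north_south

# Guangzhou lies at roughly 23 degrees north
ew, ns = cell_size_km(0.01, 23.0)
print(f"0.01 deg at 23N is about {ew:.2f} km x {ns:.2f} km")
```

At higher latitudes the same angular resolution gives noticeably narrower cells, which is worth keeping in mind when reusing a resolution across cities.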

Q: How does the entity hypernode work exactly?

Each land cover category (e.g., "Tree cover", "Built-up", "Water body") becomes a hypernode. An edge connects the entity hypernode to every grid cell that contains that land cover type, weighted by the proportion of that land cover in the cell. During message passing, the entity node aggregates information from all connected grid cells, then broadcasts it back — effectively letting all "forest cells" share information regardless of distance.

Q: Can I add my own node/edge types?

The current version supports area, entity, and POI node types. To add custom node types, you would extend GraphRawData in geohg/data/sources/base.py and update HeteroGraphBuilder in geohg/data/builders/graph.py. The GNN encoder automatically adapts via PyG's to_hetero().

14. Contributing

Contributions are welcome! Here's how to get started:

# Clone and install in development mode
git clone <your-repo-url>
cd GeoHG
pip install -e ".[dev]"

# Run tests
pytest tests/

# Run a quick training to verify
geohg train --config configs/examples/guangzhou_carbon.yaml --epochs 50

Areas where contributions are particularly welcome:

New data sources (satellite imagery, census data, mobility data)
Additional GNN architectures (GAT, GIN, etc.)
Visualization improvements
New city datasets
Documentation and tutorials

License

MIT License. See pyproject.toml for details.

Acknowledgments

GeoHG builds on:

PyTorch Geometric — the heterogeneous graph framework
ESA WorldCover — global land cover data at 10m resolution
The urban computing and GeoAI research community

If you use GeoHG in your research, please cite our SIGSPATIAL 2025 paper (see top of this README).
