IGraSS: Iterative Graph-constrained Semantic Segmentation (📍 IJCAI 2025)
This repository contains an implementation of an iterative segmentation pipeline for extracting irrigation canal networks from satellite imagery.
At a high level, the pipeline repeats three stages:
- Generate patch datasets from large `.npy` imagery and mask arrays.
- Train a segmentation model (ResUNet, DeepLabV3+, or ResNet50-UNet).
- Refine ground-truth masks using graph reachability constraints, then feed refined labels into the next iteration.
The orchestration entry point is `Framework/run_framework.py`.
```
Framework/
├── run_framework.py       # End-to-end iterative pipeline CLI
├── generate_data.py       # Patch extraction and dataset generation
├── resunet_train.py       # Model training routine
├── run_single_process.py  # Prediction + graph-based refinement logic
├── ResUNet.py             # ResUNet model definition
├── deeplabModelV3.py      # DeepLabV3+ model definition
├── network_utils.py       # Graph / connectivity utility functions
├── path_utils.py          # Data handling helpers (patching, dilation, loaders)
├── utils.py               # Dataset + preprocessing utilities for training
└── refined_metrcis.py     # Custom segmentation metrics
```
`run_framework.py` loops for `--iterations` cycles and selects behavior based on `--process_type`:
- `d`: generate training patches only.
- `t`: train model only.
- `f`: full pipeline (generate patches, train, refine GT).
In full mode (`f`):
- A patch dataset is generated via `generate_dataset_set(...)`.
- A model is trained via `run_resunet(...)`.
- The generated predictions are used by `gen_refine_gt(...)` (from `run_single_process.py`) to create a refined ground truth.
- The next iteration uses that refined GT and the previous model weights as initialization.
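The control flow above can be sketched as follows. The three stage functions here are simplified stand-ins for `generate_dataset_set(...)`, `run_resunet(...)`, and `gen_refine_gt(...)`; their real signatures and return values differ.

```python
# Minimal, self-contained sketch of IGraSS's iterate-train-refine control flow.
# The stage functions are stand-ins; the real implementations live in
# generate_data.py, resunet_train.py, and run_single_process.py.

def generate_patches(gt, iteration):
    return [("img_patch", "mask_patch")]          # stand-in for patch extraction

def train_model(patches, init_weights):
    return {"weights": (init_weights or 0) + 1}   # stand-in for run_resunet(...)

def refine_gt(gt, model):
    return gt + ["refined"]                       # stand-in for gen_refine_gt(...)

def run_full_pipeline(gt, iterations):
    weights = None                                # --from_scratch: no initial weights
    for it in range(iterations):
        patches = generate_patches(gt, it)        # stage 1: patch dataset
        model = train_model(patches, weights)     # stage 2: train segmentation model
        weights = model["weights"]                # next iteration warm-starts from these
        gt = refine_gt(gt, model)                 # stage 3: graph-constrained refinement
    return gt, weights
```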
The current code expects NumPy arrays on disk and relies on filename conventions:
- Images: `train_img_2020.npy`, `train_img_2021.npy`, `train_img_2022.npy`, `train_img_2023.npy`
- Masks (read by default): `train_mask_2020.npy` … `train_mask_2023.npy`
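A sketch of how these conventions can be consumed, assuming the filename pattern above and the default patch size of 512 (the repo's actual loaders are in `path_utils.py` and may differ; `load_yearly_pairs` and `to_patches` are hypothetical helpers):

```python
# Hypothetical helpers illustrating the on-disk conventions: yearly
# train_img_YYYY.npy / train_mask_YYYY.npy arrays, cut into square patches.
import numpy as np

def load_yearly_pairs(data_dir, years=(2020, 2021, 2022, 2023)):
    """Yield (image, mask) arrays following the train_img_YYYY / train_mask_YYYY convention."""
    for y in years:
        img = np.load(f"{data_dir}/train_img_{y}.npy")
        mask = np.load(f"{data_dir}/train_mask_{y}.npy")
        yield img, mask

def to_patches(arr, patch_size=512):
    """Cut a 2-D array into non-overlapping square patches (edge remainders dropped)."""
    h, w = arr.shape[:2]
    return [arr[i:i + patch_size, j:j + patch_size]
            for i in range(0, h - patch_size + 1, patch_size)
            for j in range(0, w - patch_size + 1, patch_size)]
```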
The refinement and generation modules currently include several hard-coded absolute paths under /scratch/... used in the original research environment. You should update those for your machine (see Porting notes).
Recommended: Python 3.10+ in a virtual environment.
```
python -m venv .venv
source .venv/bin/activate
pip install --upgrade pip
pip install numpy scipy matplotlib opencv-python patchify numba networkx pillow
pip install tensorflow keras
pip install segmentation-models
```
Depending on your TensorFlow/Keras version, you may need version pinning for compatibility.
Full pipeline:
```
python Framework/run_framework.py \
    --process_type f \
    --iterations 5 \
    --model_type resunet \
    --output_path /path/to/output/ \
    --from_scratch \
    --dilation \
    --k 4 \
    --R 150 \
    --th 0.5 \
    --r_th 0.1 \
    --epochs 10
```
Data generation only:
```
python Framework/run_framework.py \
    --process_type d \
    --output_path /path/to/output/
```
Training only:
```
python Framework/run_framework.py \
    --process_type t \
    --model_type resunet \
    --data_dir /path/to/data/root \
    --img_folder_name <generated_image_folder> \
    --mask_folder_name <generated_mask_folder> \
    --output_path /path/to/output/ \
    --epochs 10 \
    --from_scratch
```
See all options with `python Framework/run_framework.py -h`.

General:
- `--process_type`: `d`, `t`, or `f`.
- `--iterations`: number of refinement cycles.
- `--output_path`: root folder for generated patches, checkpoints, logs, and refinement artifacts.
Data generation:
- `--image_path`: folder with yearly image `.npy` arrays.
- `--gt_path`: optional path to a GT `.npy`; if omitted, defaults are used.
- `--patch_size`: patch size (default `512`).
- `--dilation`: enable GT dilation pre-processing.
- `--k`: dilation kernel size.
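The `--dilation` / `--k` pre-processing can be illustrated as below; the repo's actual implementation lives in `Framework/path_utils.py`, while this sketch uses `scipy.ndimage` (a listed dependency) and a hypothetical `dilate_gt` helper:

```python
# Illustrative sketch of GT dilation: thin canal masks are thickened with a
# k x k structuring element before training. Not the repo's exact code.
import numpy as np
from scipy.ndimage import binary_dilation

def dilate_gt(mask, k=4):
    """Dilate a binary ground-truth mask with a k x k square kernel."""
    return binary_dilation(mask.astype(bool), structure=np.ones((k, k))).astype(mask.dtype)
```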
Training:
- `--model_type`: `resunet`, `deeplabv3+`, or `resnet`.
- `--data_dir`: data root used by dataset loaders.
- `--batch_size`, `--learning_rate`, `--epochs`, `--optimizer`: standard training hyperparameters.
- `--from_scratch`: train a new model; if omitted, `--pretrained_weights` is required.
Refinement:
- `--R`: radius for source-terminal pairing.
- `--th`: prediction threshold.
- `--r_th`: source-terminal connectivity threshold.
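The reachability idea behind these parameters can be sketched with `networkx` (a listed dependency): thresholded predictions form a graph, sources and terminals within radius `R` are paired, and a low fraction of connected pairs signals that the GT needs refinement. The real logic lives in `Framework/network_utils.py` and `run_single_process.py`; the names below are hypothetical.

```python
# Illustrative sketch of the source-terminal reachability check, not the
# repo's exact refinement logic.
import networkx as nx

def connectivity_ratio(graph, pairs):
    """Fraction of (source, terminal) pairs connected in the canal graph."""
    if not pairs:
        return 1.0
    connected = sum(nx.has_path(graph, s, t)
                    for s, t in pairs if s in graph and t in graph)
    return connected / len(pairs)

def needs_refinement(graph, pairs, r_th=0.1):
    """Trigger GT refinement when too few source-terminal pairs are reachable."""
    return connectivity_ratio(graph, pairs) < r_th
```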
Under `--output_path`, the pipeline writes:
- Generated patch folders (`*_images_512`, `*_masks_512`).
- Model checkpoints in `models/`.
- Training CSV logs in `logs/`.
- Intermediate `.npy` graph/refinement artifacts.
- Iteration summary CSV files.
The codebase was developed in a fixed filesystem layout and still contains hard-coded paths (for example under /scratch/gza5dr/...). To run this project elsewhere, you will likely need to:
- Replace hard-coded data paths in `Framework/generate_data.py`, `Framework/resunet_train.py`, `Framework/run_single_process.py`, and `Framework/configure_p.py`.
- Verify filename conventions for image/mask arrays.
- Ensure generated test/train split folders match what `DatasetHandler` expects.
- This repository appears research-oriented and may require environment-specific adaptation before first successful run.
- Some defaults in CLI arguments reference machine-specific paths and should be overridden.
- There is no pinned `requirements.txt` yet; dependency resolution may vary by platform.
If you find this work useful, please cite:
```bibtex
@inproceedings{ijcai2025p1076,
  title     = {IGraSS: Learning to Identify Infrastructure Networks from Satellite Imagery by Iterative Graph-constrained Semantic Segmentation},
  author    = {Hoque, Oishee Bintey and Adiga, Abhijin and Adiga, Aniruddha and Chaudhary, Siddharth and Marathe, Madhav V. and Ravi, S.S. and Rajagopalan, Kirti and Wilson, Amanda and Swarup, Samarth},
  booktitle = {Proceedings of the Thirty-Fourth International Joint Conference on Artificial Intelligence (IJCAI-25)},
  pages     = {9683--9691},
  year      = {2025},
  doi       = {10.24963/ijcai.2025/1076},
  url       = {https://doi.org/10.24963/ijcai.2025/1076}
}
```