Unsupervised anomaly localization with high-resolution and time-series satellite imagery: from global disaster dataset to zero-shot model via SAM (ISPRS 2026)
AnomalyCD is a two-stage anomaly change detection framework for multi-temporal remote sensing imagery. It first performs bitemporal change detection with SAM (Segment Anything Model), then distinguishes normal changes from anomalous changes using historical normal observations.
This repository provides a self-contained implementation for inference and evaluation. Core dependencies (SAM and GeoTIFF I/O utilities) are bundled locally, so no external project paths need to be configured.
AnomalyCD follows a two-stage pipeline:
| Stage | Script | Description |
|---|---|---|
| Stage 1 | `run_stage1.py` | Detect pixel-level changes between the latest normal image and the anomaly image |
| Stage 2 | `run_stage2.py` | Identify anomalous changes using all normal temporal images and Stage 1 outputs |
| Evaluation | `run_eval.py` | Per-event recall, precision, and F1 for Stage 1 and Stage 2 |
Stage 1 — Candidate Change Detection
- Feed the latest normal image and the anomaly image into SAM.
- Generate object masks and image-encoder feature maps for both images.
- Compare features within SAM segments and produce a continuous change map.
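The Stage 1 comparison can be pictured with a minimal sketch. It assumes the encoder features have already been resampled to the mask resolution and that cosine distance between per-segment mean features is used as the change score; the function name `segment_change_map` and the choice of distance are illustrative assumptions, not the repository's actual implementation.

```python
import numpy as np

def segment_change_map(feat_a, feat_b, segments):
    """Per-segment cosine distance between two (H, W, C) feature maps.

    `segments` is an (H, W) integer array of segment IDs. Every pixel in a
    segment receives that segment's distance, giving a continuous map.
    Hypothetical helper: the real Stage 1 code may differ.
    """
    change = np.zeros(segments.shape, dtype=np.float32)
    for seg_id in np.unique(segments):
        mask = segments == seg_id
        mean_a = feat_a[mask].mean(axis=0)  # mean feature inside the segment
        mean_b = feat_b[mask].mean(axis=0)
        denom = np.linalg.norm(mean_a) * np.linalg.norm(mean_b) + 1e-8
        change[mask] = 1.0 - float(mean_a @ mean_b) / denom
    return change

# Toy example: segment 0 is unchanged, segment 1 changes direction in feature space.
segments = np.array([[0, 0], [1, 1]])
feat_a = np.array([[[1.0, 0.0], [1.0, 0.0]], [[1.0, 0.0], [1.0, 0.0]]])
feat_b = np.array([[[1.0, 0.0], [1.0, 0.0]], [[0.0, 1.0], [0.0, 1.0]]])
change = segment_change_map(feat_a, feat_b, segments)
```

In the toy example the first row scores close to 0 and the second row close to 1, which is the kind of continuous map Stage 1 writes out.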
Stage 2 — Anomaly Identification
- Load the Stage 1 change map and all normal temporal images together with the anomaly image.
- Extract SAM features for each temporal observation over changed regions.
- Measure temporal feature dissimilarity relative to the anomaly image and produce the final AnomalyCD map.
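One plausible reading of the Stage 2 idea is sketched below: a changed region is scored by how far its anomaly-time feature lies from the closest historical normal feature, so a change that matches any earlier normal state scores low. The helper names and the use of the minimum cosine distance are assumptions made for illustration; the repository may aggregate the temporal dissimilarities differently.

```python
import math

def cosine_distance(u, v):
    """Cosine distance between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm + 1e-8)

def anomaly_score(anomaly_feat, normal_feats):
    """Smallest distance from the anomaly-time feature to any normal-time feature.

    Hypothetical aggregation: low when the region resembles some past state.
    """
    return min(cosine_distance(anomaly_feat, f) for f in normal_feats)

# Two historical normal states of one region (e.g. two seasons).
normal_feats = [[1.0, 0.0], [0.0, 1.0]]
seasonal = anomaly_score([0.0, 1.0], normal_feats)   # matches a past state
novel = anomaly_score([-1.0, 0.0], normal_feats)     # unlike any past state
```

Here `seasonal` is close to 0 while `novel` is close to 1, so only the second region would survive a threshold on the anomaly map.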
The GDL (Global Disaster Land surface) dataset can be requested through the following form:
Submit your name, organization, and email to receive download instructions.
The dataset covers normal control sites and anomaly events across six major categories worldwide:
| ID | Category | Color in map |
|---|---|---|
| 0 | Normal | Black |
| 1 | Explosion | Red |
| 2 | Collapse | Orange |
| 3 | Landslide | Green |
| 4 | Fire | Yellow |
| 5 | Dam break | Light blue |
| 6 | Others | Grey |
Event folder names are prefixed with the category ID (e.g., 0_ for normal, 1_ for explosion).
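As a small illustration of the naming rule, the category can be recovered from a folder name as follows. The `CATEGORIES` dictionary and the `event_category` helper are hypothetical names introduced here; they are not part of the repository.

```python
# Category IDs as listed in the table above.
CATEGORIES = {
    0: "Normal",
    1: "Explosion",
    2: "Collapse",
    3: "Landslide",
    4: "Fire",
    5: "Dam break",
    6: "Others",
}

def event_category(folder_name):
    """Return (category_id, category_name) from an event folder name."""
    prefix, _, _ = folder_name.partition("_")  # text before the first underscore
    category_id = int(prefix)
    return category_id, CATEGORIES[category_id]

explosion = event_category("1_Equatorial_Guinea_Explosion-20210313_30")
normal = event_category("0_Egypt_30")
```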
After downloading and extracting the dataset, each event is stored in its own folder with multi-temporal GeoTIFF files sorted by filename.
Anomaly event example:
data/
└── 1_Equatorial_Guinea_Explosion-20210313_30/
    ├── anomaly_20200913.tif          # anomaly image (term_names[0])
    ├── anomaly_20200913_label.tif    # pixel-wise annotation (term_names[1])
    ├── normal_20040712.tif           # normal temporal image
    ├── normal_20140913.tif
    └── normal_20170913.tif           # latest normal image (term_names[-1])
Normal control event example (ID 0):
data/
└── 0_Egypt_30/
    ├── normal_20180622.tif
    ├── normal_20181222.tif
    ├── normal_20190622.tif
    └── normal_20201222.tif           # latest image (term_names[-1])
Normal events contain only normal_* images. Stage 1 compares the last two temporal images; Stage 2 uses the full time series.
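The layout above can be turned into file roles with a few lines of Python. This is a sketch that works on a plain list of filenames and assumes the roles follow purely from the `anomaly_`, `normal_`, and `_label` naming; `split_event_files` is a hypothetical helper, not a function from the repository.

```python
def split_event_files(filenames):
    """Group an event's files by role and pick the image pair Stage 1 compares."""
    term_names = sorted(filenames)  # files are ordered by filename
    labels = [n for n in term_names if "_label" in n]
    anomalies = [n for n in term_names
                 if n.startswith("anomaly_") and "_label" not in n]
    normals = [n for n in term_names if n.startswith("normal_")]
    if anomalies:
        # Anomaly event: latest normal image vs. the anomaly image.
        stage1_pair = (normals[-1], anomalies[0])
    else:
        # Normal control event: the last two temporal images.
        stage1_pair = (normals[-2], normals[-1])
    return {"normal": normals, "anomaly": anomalies,
            "label": labels, "stage1_pair": stage1_pair}

anomaly_event = split_event_files([
    "normal_20170913.tif", "anomaly_20200913_label.tif",
    "normal_20040712.tif", "anomaly_20200913.tif", "normal_20140913.tif",
])
normal_event = split_event_files([
    "normal_20201222.tif", "normal_20180622.tif",
    "normal_20190622.tif", "normal_20181222.tif",
])
```

Stage 2 would then use every entry in `normal` together with the anomaly image, or the full series for a normal control event.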
| Prefix | Meaning |
|---|---|
| `anomaly_*` | Anomaly-time remote sensing image |
| `*_label*` | Pixel-wise annotation raster |
| `normal_*` | Historical normal-time images |
| Event folder prefix `0_` | Normal control event (category ID 0, no label file) |
| Event folder prefix 1–6 | Anomaly event category (see table above) |
Inference vs. evaluation: Stage 1 and Stage 2 process both anomaly and normal events. Only `run_eval.py` skips normal control events (`0_`), because they have no binary anomaly labels for recall / precision / F1.
The label raster is not a binary map. Each pixel value encodes a semantic category:
- 0: background
- Different positive integer values: different anomalous land-cover / object categories within the scene
Multiple anomaly instances or land-cover types in the same event may therefore carry different label IDs in a single annotation file.
Note on evaluation: For TPR / precision / F1 evaluation, the code converts labels to a binary anomaly mask via `binarize_label()` (value `1` is treated as background; all other positive values are merged into the anomaly class). This follows the original evaluation protocol and does not change the multi-class nature of the raw annotations.
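The binarization rule in the note can be sketched as below. This follows the wording of the note only (values `0` and `1` are non-anomalous, every other positive value is anomalous); the function name `binarize_label_sketch` and its `background_values` parameter are illustrative assumptions and do not reproduce the repository's `binarize_label()`.

```python
import numpy as np

def binarize_label_sketch(label, background_values=(0, 1)):
    """Merge all non-background label IDs into a single anomaly class.

    Assumption: 0 and 1 are the only non-anomalous values, as the note states.
    """
    label = np.asarray(label)
    return (~np.isin(label, background_values)).astype(np.uint8)

# Multi-class annotation with IDs 2, 3 and 5 marking different anomaly types.
label = np.array([[0, 1, 2],
                  [3, 0, 5]])
mask = binarize_label_sketch(label)
```

The raw annotation is left untouched; only the returned `mask` is binary.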
Create a Python 3.7+ environment and install the required packages:
cd AnomalyCD
pip install -r requirements.txt
Or activate the pre-configured RSAD conda environment:
conda activate RSAD
pip install -r requirements.txt
Core dependencies:
| Package | Version (tested) | Purpose |
|---|---|---|
| `numpy` | 1.21.6 | Array operations |
| `torch` / `torchvision` | 1.13.0 / 0.14.0 | SAM inference (CUDA) |
| `opencv-python` | 4.8.0.76 | Image processing |
| `scikit-learn` | 1.0.2 | Evaluation metrics |
| `scikit-image` | 0.19.2 | Morphological post-processing |
| `GDAL` | 3.0.2 | GeoTIFF read/write |
| `tqdm` | 4.65.2 | Progress bars |
Note: If `pip install` fails, installing PyTorch (with CUDA) and GDAL via conda is recommended:
conda install pytorch==1.13.0 torchvision==0.14.0 pytorch-cuda=11.7 -c pytorch -c nvidia
conda install -c conda-forge gdal=3.0.2
pip install -r requirements.txt
Download the SAM ViT-B weights and set the path in config.py:
SAM_CHECKPOINT = "/path/to/sam_vit_b_01ec64.pth"
Update the default paths in config.py:
DEFAULT_DATA_ROOT = "/path/to/GDL/data"
DEFAULT_OUTPUT_ROOT = "/path/to/output/AnomalyCD"
AnomalyCD/
├── README.md
├── config.py                   # global configuration
├── img_io.py                   # GeoTIFF read/write (GDAL)
├── segment_anything_origin/    # bundled SAM library
├── utils.py                    # shared utilities
├── stage1_analysis.py          # Stage 1 core algorithm
├── stage2_analysis.py          # Stage 2 core algorithm
├── eval_metrics.py             # evaluation metrics
├── run_stage1.py               # Stage 1 entry point
├── run_stage2.py               # Stage 2 entry point
├── run.py                      # full pipeline entry point
├── run_eval.py                 # evaluation entry point
├── quick_test.py               # single-patch quick test
├── requirements.txt            # Python dependencies
└── docs/
    └── images/
        ├── framework.png
        └── dataset_distribution.png
cd AnomalyCD
conda activate RSAD
Verify SAM weights, CUDA, and data paths before full inference:
python quick_test.py
Expected output: QUICK TEST PASSED.
# Run Stage 1 and Stage 2 sequentially on all events
python run.py --stage all --device cuda:0
# Stage 1 only
python run.py --stage 1 --device cuda:0
# Stage 2 only (requires Stage 1 outputs)
python run.py --stage 2 --device cuda:0
# Single event
python run.py --stage all --device cuda:0 --event "1_Equatorial_Guinea_Explosion-20210313_30"
Stage 1: Change Detection
python run_stage1.py --device cuda:0
python run_stage1.py --device cuda:0 --no-skip-existing
python run_stage1.py --device cuda:0 --event "1_Equatorial_Guinea_Explosion-20210313_30"
Stage 2: Anomaly Change Detection
python run_stage2.py --device cuda:0
python run_stage2.py --device cuda:0 --event "1_Equatorial_Guinea_Explosion-20210313_30"
After Stage 2 finishes, run the official evaluation:
python run_eval.py
Each event prints Stage 1 and Stage 2 recall, precision, and F1:
1_xxx_event_name
Stage1 recall=0.8500 precision=0.7200 F1=0.7800
Stage2 recall=0.9100 precision=0.8000 F1=0.8520
Save per-event records to CSV:
python run_eval.py --output-csv eval_records.csv
run_stage1.py arguments:
| Argument | Default | Description |
|---|---|---|
| `--data-root` | `config.DEFAULT_DATA_ROOT` | Dataset root directory |
| `--output-root` | `config.STAGE1_OUTPUT_DIR` | Stage 1 output directory |
| `--device` | `cuda:0` | GPU device |
| `--patch-size` | `2048` | Sliding-window patch size |
| `--event` | all events | Process a single event folder |
| `--no-skip-existing` | - | Re-run events that already have outputs |
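To give a feel for what a sliding-window patch size of 2048 means on a large scene, the sketch below tiles an image into non-overlapping windows. Whether the repository uses overlap or padding at the borders is not stated, so the non-overlapping layout and the `patch_windows` helper are assumptions for illustration.

```python
def patch_windows(height, width, patch_size=2048):
    """Yield (row, col, h, w) windows that tile the image without overlap.

    Border windows are clipped to the image extent (assumed behaviour).
    """
    for row in range(0, height, patch_size):
        for col in range(0, width, patch_size):
            yield (row, col,
                   min(patch_size, height - row),
                   min(patch_size, width - col))

# A 9840 x 13184 scene is covered by 5 rows x 7 columns of windows.
windows = list(patch_windows(9840, 13184))
```

Each window would be passed through SAM separately, which is why runtime grows quickly with image size.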
run_stage2.py arguments:
| Argument | Default | Description |
|---|---|---|
| `--data-root` | `config.DEFAULT_DATA_ROOT` | Dataset root directory |
| `--stage1-root` | `config.STAGE1_OUTPUT_DIR` | Stage 1 output directory |
| `--output-root` | `config.STAGE2_OUTPUT_DIR` | Stage 2 output directory |
| `--device` | `cuda:0` | GPU device |
| `--event` | all events | Process a single event folder |
| `--skip-existing` | `False` | Skip events with existing `AnomalyCD_map.tif` |
run_eval.py arguments:
| Argument | Default | Description |
|---|---|---|
| `--data-root` | `config.DEFAULT_DATA_ROOT` | Dataset root (for reading labels) |
| `--result-root` | `config.STAGE2_OUTPUT_DIR` | Stage 2 result directory |
| `--max-events` | `80` | Maximum number of events to evaluate |
| `--quantile-threshold` | `0.945` | Quantile threshold for Stage 1 binarization |
| `--stage2-threshold` | `0.08` | Fixed threshold for Stage 2 AnomalyCD binarization |
| `--area-ratio` | `0.0003` | Minimum connected-component area ratio |
| `--background-weight` | `0.1` | Background false-positive weight in precision |
| `--normalize-anomaly-map` | `False` | Apply min-max normalization before thresholding |
| `--output-csv` | - | Save per-event evaluation records |
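The three thresholding-related options can be illustrated with a short NumPy sketch. It assumes a strict `>` comparison, min-max normalization over the whole map, and a minimum component area computed as ratio times image area; the helper names are hypothetical and the repository's exact conventions may differ.

```python
import numpy as np

def binarize_by_quantile(change_map, q=0.945):
    """Keep pixels above the q-th quantile of the map (Stage 1 style)."""
    return change_map > np.quantile(change_map, q)

def binarize_by_fixed(anomaly_map, threshold=0.08, normalize=False):
    """Keep pixels above a fixed threshold (Stage 2 style)."""
    if normalize:
        lo, hi = anomaly_map.min(), anomaly_map.max()
        anomaly_map = (anomaly_map - lo) / (hi - lo + 1e-8)
    return anomaly_map > threshold

def min_component_area(height, width, area_ratio=0.0003):
    """Smallest connected-component size, in pixels, kept after filtering."""
    return int(area_ratio * height * width)

change_map = np.arange(1000, dtype=np.float64).reshape(20, 50)
stage1_mask = binarize_by_quantile(change_map)            # top 5.5% of pixels
stage2_mask = binarize_by_fixed(np.array([0.05, 0.08, 0.2]))
min_area = min_component_area(9840, 13184)
```

With the default ratio, a 9840×13184 scene discards components smaller than 38919 pixels, which shows how strongly `--area-ratio` scales with image size.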
Directory: {STAGE1_OUTPUT_DIR}/{event_name}/
| File | Description |
|---|---|
| `change_map_continuous.tif` | Continuous change map |
Directory: {STAGE2_OUTPUT_DIR}/{event_name}/
| File | Description |
|---|---|
| `change_map_filtered.tif` | Filtered change map (quantile 0.7) |
| `AnomalyCD_map.tif` | Continuous anomaly change detection map |
| `AnomalyCD_map.png` | Binarized preview using evaluation settings (fixed threshold 0.08 + morphology) |
run_eval.py prints per-event metrics only:
- Stage 1: recall, precision, F1 on `change_map_filtered.tif`
- Stage 2: recall, precision, F1 on `AnomalyCD_map.tif`
Key settings in config.py:
| Parameter | Stage | Default | Description |
|---|---|---|---|
| `STAGE1_PATCH_SIZE` | 1 | 2048 | Sliding-window patch size |
| `STAGE1_CHANGE_QUANTILE` | 1 | 0.9 | Change-map binarization quantile |
| `STAGE1_SAM_PARAMS` | 1 | see config | SAM automatic mask generator settings |
| `STAGE2_CHANGE_MAP_QUANTILE` | 2 | 0.7 | Change-map filtering quantile |
| `STAGE2_MIN_MASK_PIXELS` | 2 | 1500 | Minimum mask pixel count |
| `STAGE2_SAM_PARAMS` | 2 | see config | SAM automatic mask generator settings |
| `EVAL_QUANTILE_THRESH` | Eval | 0.945 | Stage 1 evaluation binarization quantile |
| `EVAL_STAGE2_FIXED_THRESH` | Eval | 0.08 | Stage 2 AnomalyCD fixed binarization threshold |
| `EVAL_AREA_RATIO` | Eval | 0.0003 | Small-region filtering area ratio |
| `EVAL_BACKGROUND_WEIGHT` | Eval | 0.1 | Weighted precision background weight |
Stage 2 automatically selects the patch size: 1024 when the short side is below 2048, otherwise 2048.
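The patch-size rule stated above amounts to a one-line function. `stage2_patch_size` is a hypothetical name used here for illustration; the rule itself is taken directly from the sentence above.

```python
def stage2_patch_size(height, width):
    """1024 when the short side is below 2048 pixels, otherwise 2048."""
    return 1024 if min(height, width) < 2048 else 2048

small_scene = stage2_patch_size(1500, 4000)
boundary_scene = stage2_patch_size(2048, 2048)
large_scene = stage2_patch_size(9840, 13184)
```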
The evaluation follows the protocol in the original Eval/compare_tpr_recall_F1_.py:
- Change map (Stage 1): binarized by quantile threshold (`0.945`); no morphological post-processing
- AnomalyCD map (Stage 2): binarized by fixed threshold (`0.08`), then morphologically closed and filtered to remove small connected components
- Recall (TPR): fraction of anomaly pixels correctly detected
- Precision: weighted precision with background false positives down-weighted by 0.1
- F1: harmonic mean of recall and precision
- Normal control events (`0_` prefix) are skipped during evaluation only
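The metric definitions can be written out from pixel counts as a sketch. It assumes that every false positive counts as a background false positive and is scaled by the background weight in the precision denominator; `weighted_scores` is a hypothetical helper, and the exact weighting in `eval_metrics.py` may be defined differently.

```python
def weighted_scores(tp, fp, fn, background_weight=0.1):
    """Recall, weighted precision and F1 from pixel counts.

    Assumption: precision = TP / (TP + w * FP), with w the background weight.
    """
    recall = tp / (tp + fn) if tp + fn else 0.0
    weighted_fp = background_weight * fp
    precision = tp / (tp + weighted_fp) if tp + weighted_fp else 0.0
    if recall + precision:
        f1 = 2 * recall * precision / (recall + precision)
    else:
        f1 = 0.0
    return recall, precision, f1

# 80 anomaly pixels found, 20 missed, 100 false alarms on background.
recall, precision, f1 = weighted_scores(tp=80, fp=100, fn=20)
```

Under this reading the 100 false alarms count as only 10 in the denominator, so precision is 80 / 90 rather than 80 / 180, and F1 rises accordingly.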
- GPU required for Stage 1 and Stage 2 inference. `run_eval.py` runs on CPU only.
- Execution order: Stage 1 → Stage 2 → Evaluation.
- Stage 1 skips existing outputs by default. Use `--no-skip-existing` to force re-inference.
- Runtime: full inference on high-resolution images (e.g., 9840×13184) can take a long time. Use `quick_test.py` first, then debug with `--event`.
- Legacy code remains in the parent repository:
  - Stage 1: `SAM_Change_Detection/run.py`
  - Stage 2: `SAM_Anomaly_Change/run.py`
  - Evaluation: `Eval/compare_tpr_recall_F1_.py`
# 1. Activate environment
conda activate RSAD
cd AnomalyCD
# 2. Edit data paths and SAM checkpoint in config.py
# 3. Verify setup
python quick_test.py
# 4. Debug on a single event
python run.py --stage all --device cuda:0 --event "1_Equatorial_Guinea_Explosion-20210313_30"
# 5. Full inference
python run.py --stage all --device cuda:0
# 6. Evaluation
python run_eval.py --output-csv eval_records.csv
