GitHub - eemeleems/GEOL0069_Project_FloodDetection: This project uses SENTINEL-1/2 machine learning for flood extent and depth estimation, with cross-scene generalisation and explainable AI, to detect extent and depth of flooding in South Yorkshire (Fishlake and Bentley/Toll Bar), during flood events of 2019 and 2021.

Flood Extent Mapping and Depth Estimation from Sentinel-1/2

SAR change detection, supervised generalisation testing, and explainable GP depth regression for the November 2019 South Yorkshire floods.
GEOL0069 (AI4EO) - Final Project | UCL Earth Sciences
Explore the project description »

Watch the video walkthrough · View the notebooks · Environmental cost

Table of Contents

About The Project
- Research Questions
- Built With
Repository Structure
Notebooks Overview
Methodology
Key Results
Getting Started
- Prerequisites
- Installation
Environmental Cost
Project Video
References
Contact
Acknowledgments
License

About The Project

In November 2019, an exceptional meteorological event dropped a month's worth of rainfall over South Yorkshire in 24 hours. The River Don breached its banks, severely flooding the village of Fishlake and damaging approximately 1,600 properties in the region.

This repository presents a machine learning pipeline addressing two central challenges in satellite-based disaster response:

Flood Extent Detection: Exploiting the low microwave backscatter of open water in Sentinel-1 Synthetic Aperture Radar (SAR) imagery.
Flood Depth Estimation: Developing topographically driven regressions via data fusion of SAR, Sentinel-2 multispectral optical indices, and Digital Elevation Models (DEM).

Rather than validating models on a single scene, we use three scenes to separate genuine spatial and temporal generalisation from simple pixel memorisation:

Training Scene: Fishlake, November 2019 (Peak flood event).
Spatial Test Scene: Bentley / Toll Bar, November 2019 (Same storm event, distinct floodplain characteristics).
Temporal Test Scene: Fishlake, January 2021 (Storm Christoph: same location, different environmental preconditions and flood boundaries).

Infographic: Overview of the Sentinel-1 SAR instrument and application to flood-mapping.

Research Questions

Do machine learning classifiers (Random Forest, SVM, CNN) trained on a change-detection backscatter threshold baseline extract underlying physical indicators that generalise across space and time, or do they simply mirror the empirical rule they were given?

Does adding post-flood optical water-colour indices provide complementary information for flood depth estimation over a flat floodplain, beyond what standard terrain-derived proxies offer alone? Which spectral features are most informative?

Built With

Core Platforms: Google Earth Engine & geemap
Sensors: Sentinel-1 (SAR) & Sentinel-2 (MSI) via the Copernicus programme
Data & Terrain: Copernicus 30 m Global DEM
Machine Learning: scikit-learn (Random Forest, SVM, Gaussian Process Regression with ARD)
Deep Learning: TensorFlow / Keras (Patch-based 2D Convolutional Neural Network)
Explainable AI (XAI) & Tracking: SHAP & CodeCarbon

(back to top)

Repository Structure

GEOL0069_Project_FloodDetection/
├── README.md                     <- Primary landing page & project overview
├── PROJECT_DESCRIPTION.md        <- Introduction to the problem
├── ENVIRONMENTAL_COST.md         <- Emissions tracking & analysis
├── Sentinel1_INFOGRAPHIC.png
├── LICENSE
├── Project_Notebooks/
│   ├── Flood_Notebook1_DataAcquisition.ipynb     <- SAR/DEM/S2 acquisition + threshold baseline
│   ├── Flood_Notebook2_Classification.ipynb      <- RF / SVM / CNN + generalisation tests
│   └── Flood_Notebook3_Regression_XAI.ipynb      <- depth proxy, GP regression, ARD, SHAP
└── images/
    ├── LOGO_README.png
    ├── Notebook1_3Scenes_ThresholdBaseline.png
    ├── Notebook1_FishlakeSentinel1.png
    ├── Notebook1_FishlakeThresholdBaseline.png
    ├── Notebook2_BentleyTollBar_3Models.png
    ├── Notebook2_ModelComparisonIoU.png
    ├── Notebook3_FloodUncertainty+Depth.png
    ├── Notebook3_GP_ARD.png
    ├── Notebook3_Kmeans.png
    ├── Notebook3_NDWI+MNDWI.png
    └── Notebook3_SHAPbeeswarm.png

(back to top)

Notebooks Overview

Notebook	Content
1. Data Acquisition	Fetches Sentinel-1 SAR (pre/mid-flood), Copernicus DEM, and Sentinel-2 optical imagery via Earth Engine for all three scenes. Computes an independent SAR change-detection Threshold Baseline flood map for each scene, used throughout the project as a reference rather than ground truth.
2. Classification	Splits the training scene into a confident core and an ambiguous margin to avoid label circularity, then trains Random Forest, SVM, and a CNN on the confident core. Evaluates all three against the Threshold Baseline on the ambiguous margin (in-scene), the spatial test scene, and the temporal test scene.
3. Regression & XAI	Defines a DEM-based relative depth proxy, computes Sentinel-2 water-colour indices (NDWI, MNDWI, Stumpf ratio), and compares three Gaussian Process regression approaches (SAR+terrain, optical, combined) using ARD lengthscales for interpretation. Cross-checks with K-means clustering and SHAP, and discusses the project's environmental footprint.

(back to top)

Methodology

Phase 1: Threshold Baseline (Notebook 1)

Every comparison in this project is made against a single SAR change-detection rule: a pixel is flagged as flooded if its backscatter drops by more than 3 dB between the pre-flood and mid-flood Sentinel-1 acquisitions and falls below -17 dB in absolute terms. This Threshold Baseline is the fixed reference against which every downstream model in Notebooks 2 and 3 is tested.
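The rule above can be sketched in a few lines of NumPy. This is a minimal illustration, not the Earth Engine code from Notebook 1: it assumes VV backscatter already in dB, assumes the -17 dB test is applied to the mid-flood image, and uses a hypothetical function name and invented toy values.

```python
import numpy as np

def threshold_baseline(vv_pre_db, vv_mid_db, drop_db=3.0, abs_db=-17.0):
    """Flag pixels whose backscatter fell by more than `drop_db` dB
    and whose mid-flood backscatter is below `abs_db` dB."""
    vv_pre_db = np.asarray(vv_pre_db, dtype=float)
    vv_mid_db = np.asarray(vv_mid_db, dtype=float)
    delta = vv_mid_db - vv_pre_db  # negative where backscatter dropped
    return (delta < -drop_db) & (vv_mid_db < abs_db)

# Toy 2x2 scene in dB (values invented for illustration only)
vv_pre = np.array([[-10.0, -12.0], [-16.0, -9.0]])
vv_mid = np.array([[-19.0, -13.0], [-18.0, -20.0]])
flood_mask = threshold_baseline(vv_pre, vv_mid)
```

In this toy scene the bottom-left pixel is dark enough in the mid-flood image (-18 dB) but only dropped by 2 dB, so it is not flagged: both conditions must hold.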

Figure 1: Pre-flood vs. mid-flood Sentinel-1 backscatter intensity over the study area, the difference between the two, and the Copernicus DEM for reference.

Figure 2: The resulting Threshold Baseline flood extent, used as the reference point throughout the project.

Phase 2: Classification and Generalisation Testing (Notebook 2)

Figure 3: The Threshold Baseline flood extent for all of the Scenes used: Fishlake 2019, Bentley/Toll Bar 2019, and Fishlake 2021.

Training a classifier directly on the Threshold Baseline's own output would guarantee near-perfect agreement by construction, since the labels are computed from the same backscatter values the classifier sees as inputs. To avoid this, the training scene is split into a Confident Core (ΔVV < -4.5 dB, confidently flooded, or ΔVV > -1.5 dB, confidently dry) used for training, and an Ambiguous Margin (everything in between) held out for evaluation. Random Forest, SVM (RBF kernel), and a patch-based 2D CNN are trained only on the Confident Core, then tested on the Ambiguous Margin and on two fully independent scenes: a different location hit by the same storm (Bentley/Toll Bar), and the same location during a different storm fourteen months later (Fishlake, January 2021).
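A minimal sketch of the Confident Core / Ambiguous Margin split, assuming ΔVV is the mid-flood minus pre-flood VV backscatter in dB. The function name and toy values are hypothetical; only the two thresholds come from the description above.

```python
import numpy as np

def split_core_margin(delta_vv_db, flooded_below=-4.5, dry_above=-1.5):
    """Return (core, margin, labels) for an array of delta-VV values in dB."""
    delta = np.asarray(delta_vv_db, dtype=float)
    core_flooded = delta < flooded_below   # confidently flooded
    core_dry = delta > dry_above           # confidently dry
    core = core_flooded | core_dry         # pixels used for training
    margin = ~core                         # pixels held out for evaluation
    labels = np.where(core_flooded, 1, 0)  # only meaningful where core is True
    return core, margin, labels

# Toy delta-VV values in dB (invented for illustration only)
delta_vv = np.array([-6.0, -4.5, -3.0, -1.5, 0.5])
core, margin, labels = split_core_margin(delta_vv)
train_labels = labels[core]
```

Values exactly on a threshold (-4.5 or -1.5 dB here) fall into the margin because both comparisons are strict. NaN pixels also end up in the margin and would need masking separately.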

Random Forest, SVM, and CNN classification outputs

Figure 7: Predicted flood extent from Random Forest, SVM, and the CNN for Bentley/Toll Bar, Nov 2019.

Figure 9: Model comparison by Intersection over Union (IoU), the agreement metric used against the Threshold Baseline, for Margin pixels, Spatial Testing (Bentley/Toll Bar), and Temporal Testing (Fishlake, Jan 2021).
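For reference, Intersection over Union between a predicted flood mask and a reference mask can be computed as in the sketch below. The function name and toy masks are illustrative; returning NaN when both masks are empty is an assumption about how that edge case should be handled.

```python
import numpy as np

def iou(pred, ref):
    """Intersection over Union of two boolean flood masks."""
    pred = np.asarray(pred, dtype=bool)
    ref = np.asarray(ref, dtype=bool)
    union = np.logical_or(pred, ref).sum()
    if union == 0:
        return float("nan")  # undefined when neither mask contains flooded pixels
    return float(np.logical_and(pred, ref).sum() / union)

# Toy masks (invented): 2 pixels flooded in both, 4 flooded in at least one
pred = np.array([1, 1, 0, 0, 1], dtype=bool)
ref = np.array([1, 0, 0, 1, 1], dtype=bool)
score = iou(pred, ref)
```

Unlike overall accuracy, IoU ignores pixels that are dry in both masks, so it is not inflated by a large dry background.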

Phase 3: Depth Proxy and Explainable Regression (Notebook 3)

SAR backscatter shows where floodwater is, but carries no depth information. With no LiDAR or gauge data readily available for this floodplain, depth is approximated with a DEM-derived proxy in the spirit of Height Above Nearest Drainage (Rennó et al., 2008): the maximum elevation along the flood's edge, minus each pixel's own elevation. This is compared against the post-flood Sentinel-2 scene's water-colour indices - NDWI, MNDWI, and the Stumpf ratio, computed from the green, NIR, and SWIR bands. Three Gaussian Process regression models, each using a kernel with Automatic Relevance Determination (ARD), are fitted on SAR+terrain features alone, optical features alone, and the two combined. The resulting feature lengthscales are cross-checked against SHAP values from an independent Random Forest regressor.
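The depth proxy described above can be sketched as follows. This is an illustration under stated assumptions rather than the Notebook 3 implementation: the flood edge is taken to be flooded pixels with at least one dry 4-neighbour, negative depths are clipped to zero, and pixels outside the flood mask are set to NaN. Function names and the toy DEM are hypothetical.

```python
import numpy as np

def flood_edge(mask):
    """Flooded pixels with at least one dry 4-neighbour (array border counts as dry)."""
    mask = np.asarray(mask, dtype=bool)
    padded = np.pad(mask, 1, constant_values=False)
    all_neighbours_wet = (
        padded[:-2, 1:-1] & padded[2:, 1:-1]    # up, down
        & padded[1:-1, :-2] & padded[1:-1, 2:]  # left, right
    )
    return mask & ~all_neighbours_wet

def relative_depth(dem, mask):
    """Max DEM elevation along the flood edge minus each flooded pixel's elevation."""
    dem = np.asarray(dem, dtype=float)
    mask = np.asarray(mask, dtype=bool)
    edge = flood_edge(mask)
    if not edge.any():
        return np.full(dem.shape, np.nan)
    water_surface = dem[edge].max()                  # assumed flat water level
    depth = np.clip(water_surface - dem, 0.0, None)  # assumption: no negative depths
    return np.where(mask, depth, np.nan)

# Toy 5x5 DEM in metres with a flooded 3x3 interior (values invented)
dem = np.full((5, 5), 6.0)
dem[1:4, 1:4] = [[3.0, 2.9, 3.0], [2.8, 2.0, 2.9], [3.0, 2.7, 3.1]]
mask = np.zeros((5, 5), dtype=bool)
mask[1:4, 1:4] = True
depth = relative_depth(dem, mask)
```

Because a single water-surface elevation is used for the whole mask, the proxy is only as good as the assumption of a flat water level across the scene.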

(Part of) Figure 10: NDWI and MNDWI derived from the post-flood Sentinel-2 scene, used as regression features.
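Both indices in this figure are normalised differences: NDWI (McFeeters, 1996) contrasts green with NIR reflectance, and MNDWI (Xu, 2006) contrasts green with SWIR. A minimal NumPy sketch follows; the mapping to Sentinel-2 bands B3, B8 and B11 is an assumption here, as are the function name and toy reflectances.

```python
import numpy as np

def normalised_difference(a, b):
    """(a - b) / (a + b), with NaN where the denominator is zero."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    total = a + b
    out = np.full(total.shape, np.nan)
    np.divide(a - b, total, out=out, where=total != 0)
    return out

# Toy surface reflectances for two pixels: [open water, vegetated land] (invented)
green = np.array([0.10, 0.08])  # assumed Sentinel-2 B3
nir = np.array([0.02, 0.30])    # assumed Sentinel-2 B8
swir = np.array([0.01, 0.20])   # assumed Sentinel-2 B11

ndwi = normalised_difference(green, nir)    # (green - NIR) / (green + NIR)
mndwi = normalised_difference(green, swir)  # (green - SWIR) / (green + SWIR)
```

Water absorbs strongly in the NIR and SWIR, so both indices are positive for the water pixel and negative for the land pixel in this toy case.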

Figure 16: Global SHAP feature attribution for the Random Forest depth regressor, cross-checked against the GP/ARD lengthscales.
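As a rough illustration of how ARD lengthscales are read, the sketch below fits a scikit-learn Gaussian Process with one RBF lengthscale per feature on synthetic data in which only the first feature affects the target. The kernel composition, bounds, synthetic data and variable names are assumptions for illustration, not the configuration used in Notebook 3.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel, WhiteKernel

rng = np.random.default_rng(0)
X = rng.uniform(-2.0, 2.0, size=(80, 2))
# Synthetic target: depends on feature 0 only, plus a little noise
y = np.sin(2.0 * X[:, 0]) + 0.05 * rng.normal(size=80)

# Passing one length_scale per feature makes the RBF kernel anisotropic (ARD)
kernel = (
    ConstantKernel(1.0)
    * RBF(length_scale=[1.0, 1.0], length_scale_bounds=(1e-2, 1e3))
    + WhiteKernel(noise_level=0.1)
)
gpr = GaussianProcessRegressor(
    kernel=kernel, normalize_y=True, n_restarts_optimizer=2, random_state=0
)
gpr.fit(X, y)

# Fitted kernel is (Constant * RBF) + White, so the RBF part is k1.k2
lengthscales = np.asarray(gpr.kernel_.k1.k2.length_scale)
relevance = 1.0 / lengthscales  # shorter lengthscale = more relevant feature
```

A long fitted lengthscale means the prediction barely changes along that feature, which is why the irrelevant second feature should end up with the larger value. Lengthscales are only comparable across features that have been put on a common scale first.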

(back to top)

Key Results

Classification generalisation

Model	Margin IoU (in-scene)	Spatial IoU (Bentley)	Temporal IoU (Jan 2021)
Random Forest	0.980	0.994	0.974
Support Vector Machine	0.963	0.981	0.952
Patch-based 2D CNN	0.547	0.679	0.421

Random Forest scored highest on all three evaluation axes (IoU 0.974 to 0.994), consistent with learning something close to the underlying SAR threshold rule rather than scene-specific texture. The CNN generalised worst, and its IoU fell further on the temporal test (0.421) than on the spatial one (0.679), suggesting some reliance on acquisition-specific SAR texture rather than a fully transferable flood signal.

Depth regression

Feature configuration	RMSE (m)	R²
SAR + terrain	0.141	0.173
Optical only	0.165	0.112
Combined (SAR + terrain + optical)	0.136	0.186

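Combining SAR/terrain and optical features gave the best held-out performance (RMSE 0.136 m, R² 0.186), a slight improvement over SAR + terrain alone (0.141 m, 0.173) and a larger one over optical features alone (0.165 m, 0.112). All three models explain less than 20% of the variance in the depth proxy, and the gain from fusion was limited by how flat the floodplain is relative to the resolution of the depth proxy.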
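For readers less familiar with the two metrics in the table, the sketch below shows how RMSE and R² are conventionally computed from held-out targets and predictions. The helper names and toy depths are illustrative and are not project results.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean squared error, in the units of the target (metres here)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

# Toy depths in metres (invented for illustration only)
y_true = np.array([0.0, 0.1, 0.2, 0.3])
y_pred = np.array([0.1, 0.1, 0.2, 0.2])
toy_rmse = rmse(y_true, y_pred)
toy_r2 = r_squared(y_true, y_pred)

# Predicting the mean everywhere gives R² = 0, the reference point for the table
mean_r2 = r_squared(y_true, np.full(4, y_true.mean()))
```

Read this way, an R² of 0.186 means the combined model removes roughly 19% of the squared error that a constant mean-depth prediction would leave.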

Explainability

ARD lengthscales and SHAP partially agree on which features matter (elevation and water-colour indices both feature prominently) but disagree on the specific ranking, illustrating that two reasonable XAI methods applied to related models don't always produce the same pattern of results.

(back to top)

Getting Started

Prerequisites

A Google account with Earth Engine access enabled (required for Notebook 1's data acquisition).
Python 3.10+ if running locally, or just a Google account if running in Google Colab (recommended - this is how the notebooks were developed and tested).

Installation

If running in Google Colab, only the packages not already in the Colab runtime need installing - this is handled by the !pip install cells at the top of each notebook:

pip install geemap shap codecarbon -q

If running locally, install everything from requirements.txt:

git clone https://github.com/eemeleems/GEOL0069_Project_FloodDetection.git
cd GEOL0069_Project_FloodDetection
pip install -r requirements.txt

Each notebook installs its own dependencies, but the notebooks must be run in order (1 → 2 → 3), since Notebooks 2 and 3 load the feature stack saved by the previous notebook.

(back to top)

Environmental Cost

Every model trained in Notebooks 2 and 3 is tracked with CodeCarbon, and the resulting energy and CO2 figures are discussed in context - alongside the UK grid carbon intensity and the broader footprint of AI/data-centre electricity demand - in ENVIRONMENTAL_COST.md.

(back to top)

Project Video

Click to watch the video walkthrough on YouTube.

(back to top)

References

Doncaster Council (2020), Section 19 Flood Investigation Report: November 2019 Flood Event. Flood Risk Management Team. doncaster.gov.uk/services/emergencies/flood-recovery-report
GEOL0069: AI for Earth Observations, Module Content, University College London. AI4EO Github - CPOM.
International Energy Agency (IEA) (2024), Energy demand from AI: Tracking global data centre electricity trends. iea.org/reports/energy-and-ai/energy-demand-from-ai
Lundberg, S.M. and Lee, S.-I. (2017), A unified approach to interpreting model predictions. Advances in Neural Information Processing Systems, 30, 4765-4774. proceedings.neurips.cc/paper/2017/hash/8a20a8621978632d76c43dfd28b67767-Abstract.html
McFeeters, S.K. (1996), The use of the Normalized Difference Water Index (NDWI) in the delineation of open water features. International Journal of Remote Sensing, 17(7), 1425-1432. doi.org/10.1080/01431169608948714
Rasmussen, C.E. and Williams, C.K.I. (2006), Gaussian Processes for Machine Learning. MIT Press. gaussianprocess.org/gpml
Rennó, C.D., Nobre, A.D., Cuartas, L.A., Soares, J.V., Hodnett, M.G., Tomasella, J. and Waterloo, M.W. (2008), HAND, a new terrain descriptor using SRTM-DEM: Mapping terra-firme rainforest environments in Amazonia. Remote Sensing of Environment, 112(9), 3469-3481. doi.org/10.1016/j.rse.2008.03.018
Roberts, D.R., Bahn, V., Ciuti, S., Boyce, M.S., Elith, J., Guillera-Arroita, G., Hauenstein, S., Lahoz-Monfort, J.J., Schröder, B., Thuiller, W., Warton, D.I., Wintle, B.A., Hartig, F. and Dormann, C.F. (2017), Cross-validation strategies for data with temporal, spatial, hierarchical, or phylogenetic structure. Ecography, 40(8), 913-929. doi.org/10.1111/ecog.02881
Sefton, C., Muchan, K., Parry, S., Matthews, B., Barker, L., Turner, S. and Hannaford, J. (2021), The 2019/2020 floods in the UK: a hydrological appraisal. Weather, 76, 378-384. doi.org/10.1002/wea.3993
Stumpf, R.P., Holderied, K. and Sinclair, M. (2003), Determination of water depth with high-resolution satellite imagery over variable bottom types. Limnology and Oceanography, 48(1), 547-556. doi.org/10.4319/lo.2003.48.1_part_2.0547
Tupas, M.E., Roth, F., Bauer-Marschallinger, B. and Wagner, W. (2023), An intercomparison of Sentinel-1 based change detection algorithms for flood mapping. Remote Sensing, 15(5), 1200. doi.org/10.3390/rs15051200
UK Department for Energy Security and Net Zero (DESNZ) (2024), Greenhouse gas reporting: conversion factors 2024. gov.uk/government/publications/greenhouse-gas-reporting-conversion-factors-2024
UN-SPIDER Knowledge Portal, Recommended Practice: Flood Mapping and Damage Assessment Using Sentinel-1 SAR Data in Google Earth Engine. un-spider.org/.../recommended-practice-google-earth-engine-flood-mapping/step-by-step
Xu, H. (2006), Modification of normalised difference water index (NDWI) to enhance open water features in remotely sensed imagery. International Journal of Remote Sensing, 27(14), 3025-3033. doi.org/10.1080/01431160600589179

(back to top)

Contact

Emily Grace Adams - LinkedIn - emily.adams.25@ucl.ac.uk

Project Link: https://github.com/eemeleems/GEOL0069_Project_FloodDetection

(back to top)

Acknowledgments

This project is the final assignment for GEOL0069 Artificial Intelligence for Earth Observation (25/26) at University College London.
Thank you to Prof. Michel Tsamados, Weibin Chen and Shambu Bhandari Sharma for the GEOL0069 module content and guidance this project builds on.
Thank you to ESA/Copernicus for the availability of Sentinel-1 and Sentinel-2 data, and to Google Earth Engine for the processing platform.
Best-README-Template, on which this README's structure is based.

(back to top)

License

Distributed under the MIT License. See LICENSE for more information.

(back to top)
