May 24, 2026

floodwatch-ph: two-track flood mapping for the Philippines

civic-tech
satellite
case-study

A flood is an event, not a fixture. A single high-resolution snapshot cannot capture it, and the typhoon that causes the flood is exactly the cloud that blinds optical sensors. So floodwatch-ph is two tracks, with two jobs, reported as two separate metrics, never averaged.

Track A is the observed extent. Sentinel-1 C-band SAR (the VH polarization) sees through cloud. The pipeline pulls COPERNICUS/S1_GRD for an event window, builds a dry pre-event baseline, applies Otsu thresholding to the event image, gates on “got darker than the dry baseline”, removes permanent water (JRC Global Surface Water occurrence at 50% or higher plus MERIT Hydro water cells), removes steep slope, and applies a HAND-style flood-plausible-terrain mask. The last mask is the one that suppresses the well-known Philippine rice-agriculture SAR false positive. The output per acquisition date is a vector flood polygon with permanent water removed, so the layer is flood, not hydrography.
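The thresholding and masking steps can be sketched in a few lines. This is a minimal pure-Python illustration on a flat list of pixels, not the project's pipeline code. The function names, the histogram bin count, and the min_drop_db gate are assumptions made for the example; the source only says the pixel must have "got darker than the dry baseline".

```python
from typing import List, Sequence


def otsu_threshold(values: Sequence[float], bins: int = 64) -> float:
    """Return the histogram split that maximizes between-class variance."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return lo
    width = (hi - lo) / bins
    hist = [0] * bins
    for v in values:
        hist[min(int((v - lo) / width), bins - 1)] += 1
    centers = [lo + (i + 0.5) * width for i in range(bins)]
    total = len(values)
    total_sum = sum(c * h for c, h in zip(centers, hist))

    best_var, best_t = -1.0, lo
    w0, sum0 = 0, 0.0
    for i in range(bins - 1):
        w0 += hist[i]
        sum0 += centers[i] * hist[i]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        mu0 = sum0 / w0
        mu1 = (total_sum - sum0) / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var = var_between
            best_t = lo + (i + 1) * width  # upper edge of bin i
    return best_t


def flood_mask(
    event_db: Sequence[float],
    baseline_db: Sequence[float],
    permanent_water: Sequence[bool],
    steep: Sequence[bool],
    plausible: Sequence[bool],
    min_drop_db: float = 3.0,  # assumed gate size, not from the source
) -> List[bool]:
    """Flag pixels that are dark, darker than baseline, and pass every mask."""
    t = otsu_threshold(event_db)
    return [
        e < t and (b - e) >= min_drop_db and not pw and not s and p
        for e, b, pw, s, p in zip(
            event_db, baseline_db, permanent_water, steep, plausible
        )
    ]


# Toy VH backscatter in dB (made-up values, eight pixels).
event_db = [-25.0, -24.0, -26.0, -12.0, -11.0, -13.0, -25.0, -12.0]
baseline_db = [-12.0, -25.0, -13.0, -12.0, -11.0, -13.0, -13.0, -12.0]
permanent_water = [False] * 6 + [True, False]
steep = [False] * 8
plausible = [True, True, False] + [True] * 5

print(flood_mask(event_db, baseline_db, permanent_water, steep, plausible))
```

In the toy data only the first pixel survives. The second is dark in both images, so the baseline gate rejects it. The third falls outside the flood-plausible terrain mask. The seventh is removed as permanent water.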

Track B is the recurrence model. Google AlphaEarth Foundations Satellite Embedding V1 (2017, 64 dimensions, CC-BY-4.0) is the frozen substrate. A scikit-learn logistic regression head sits on top. Platt sigmoid calibration converts scores into probabilities. The training data is the Global Flood Database’s Philippine record (57 events, 2002 to 2017), split on whole events, not pixels: 40 events for training, 17 events held out. Any point that flooded in both a train and a holdout event is dropped. This is the single most important honesty decision in the project. Random-pixel splits inflate every metric because adjacent pixels are near-duplicates.
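A whole-event split with the overlap rule might look like the sketch below. It is an illustration under assumptions: the source does not say how the 17 holdout events were chosen, so the seeded shuffle is a placeholder, and the function names and the point-to-events mapping are hypothetical.

```python
import random
from typing import Dict, Iterable, List, Set, Tuple


def split_events(
    event_ids: Iterable[int], n_holdout: int, seed: int = 0
) -> Tuple[Set[int], Set[int]]:
    """Partition event ids into disjoint train and holdout sets."""
    ids = sorted(event_ids)
    random.Random(seed).shuffle(ids)  # assumed selection rule
    return set(ids[n_holdout:]), set(ids[:n_holdout])


def drop_overlap(
    point_events: Dict[str, Set[int]], train: Set[int], holdout: Set[int]
) -> Tuple[List[str], List[str]]:
    """Assign each point to one side; drop points flooded on both sides."""
    train_pts: List[str] = []
    holdout_pts: List[str] = []
    for pid, events in sorted(point_events.items()):
        in_train = bool(events & train)
        in_holdout = bool(events & holdout)
        if in_train and in_holdout:
            continue  # would leak the same location across the split
        if in_train:
            train_pts.append(pid)
        elif in_holdout:
            holdout_pts.append(pid)
    return train_pts, holdout_pts


# 57 events split 40 / 17, as in the text.
train, holdout = split_events(range(57), n_holdout=17)
print(len(train), len(holdout), train.isdisjoint(holdout))
```

Splitting on event ids first, then assigning points, is what keeps neighboring pixels from the same flood out of both sides at once.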

The numbers, with footnotes

Track B, event-disjoint holdout, 2,200 positive and 1,346 negative sample points: precision 0.949, recall 0.962, and F1 0.955 at the deployed threshold; AUC 0.974 and Brier 0.046, neither of which depends on a threshold. The negative class is “land that never flooded across any GFD event”, an easier contrast than hard negatives near floodplains; the model card states this so the number is not over-read.
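For readers who want to check how these metrics relate, here is a small sketch. The helper names and the toy probabilities are made up for illustration. Only the precision and recall passed to f1_score in the last line come from the text.

```python
from typing import Sequence, Tuple


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def precision_recall(
    probs: Sequence[float], labels: Sequence[int], threshold: float
) -> Tuple[float, float]:
    """Precision and recall after cutting probabilities at a threshold."""
    tp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 1)
    fp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 0)
    fn = sum(1 for p, y in zip(probs, labels) if p < threshold and y == 1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def brier_score(probs: Sequence[float], labels: Sequence[int]) -> float:
    """Mean squared error of the probabilities; takes no threshold."""
    return sum((p - y) ** 2 for p, y in zip(probs, labels)) / len(labels)


# Toy calibrated probabilities and labels (not project data).
probs = [0.9, 0.8, 0.6, 0.4, 0.2, 0.1]
labels = [1, 1, 0, 1, 0, 0]
print(precision_recall(probs, labels, threshold=0.5))
print(brier_score(probs, labels))

# The reported precision and recall reproduce the reported F1.
print(round(f1_score(0.949, 0.962), 3))  # 0.955
```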

Track A, validated against the Global Flood Database flooded polygon for GFD event 4300 (Tropical Storm Koppu / Lando, 2015-10-22, central Luzon): IoU 0.034, precision 0.084, recall 0.053, F1 0.065. Reported plainly because it is honest, not flattering. The only Sentinel-1 acquisition for that event is several days after GFD onset, and a single 10 m SAR pass against a multi-day 250 m optical reference sees different water. Low pixel agreement is the expected, documented limitation of this comparison, not a hidden one. The Carina 2024 demo event peaks at 184 km² of detected flood across 4 real Sentinel-1 acquisition dates.
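Pixel agreement between a detected mask and a reference mask comes down to three counts. The sketch below assumes both masks have already been resampled to a common grid and are represented as sets of (row, col) cells. That representation, the function name, and the toy cells are choices made for the example, not the project's validation code.

```python
from typing import Dict, Set, Tuple

Cell = Tuple[int, int]


def mask_agreement(detected: Set[Cell], reference: Set[Cell]) -> Dict[str, float]:
    """IoU, precision, and recall of a detected mask against a reference."""
    tp = len(detected & reference)  # flagged by both
    fp = len(detected - reference)  # detected only
    fn = len(reference - detected)  # reference only
    union = tp + fp + fn
    return {
        "iou": tp / union if union else 0.0,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


# Toy masks: 3 detected cells, 4 reference cells, 2 shared.
detected = {(0, 0), (0, 1), (1, 1)}
reference = {(0, 1), (1, 1), (2, 1), (2, 2)}
print(mask_agreement(detected, reference))
```

When the two masks were acquired days apart, the shared count shrinks even if both are individually correct, which is the effect described above.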

The recurrence classifier is bit-exact reproducible from the committed model/embeddings/floodwatch_embeddings_v1.npz (~1 MB, in git). make train && make hash-verify asserts sha256 prefix b7c702532f92c43f. No network, no GPU, about 30 seconds on a laptop.
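A prefix check of this kind can be written with the standard library alone. The sketch below is an assumption about what such a target might do, not the contents of the Makefile. The text does not say which file is hashed, so the path in the usage note is a placeholder.

```python
import hashlib


def sha256_hex(path: str, chunk_size: int = 1 << 20) -> str:
    """Stream a file through SHA-256 and return the hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_prefix(path: str, expected_prefix: str) -> bool:
    """True when the file's SHA-256 starts with the expected hex prefix."""
    return sha256_hex(path).startswith(expected_prefix.lower())
```

A caller would run something like verify_prefix("path/to/artifact", "b7c702532f92c43f") and fail the build on False, where the artifact path is hypothetical.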

The civic layer

The reason any of this matters is the gap. v1.4 cross-referenced 36,711 DPWH flood-control projects (₱1.74 trillion in allocated value, from the BetterGovPH CC0 dataset) against Track B’s recurrence layer and Track A’s observed extent, aggregate-only by province. v1.5 added four read paths over the same set: dated observed water at recorded project locations (bounded to Carina 2024, the only dated observed extent), province-level confidence honesty (the share of allocation resolved only by province-text fallback because the project carried no usable coordinate), curated and individually cited COA cross-references (the row count is the signal; per-project tagging requires a confident description match), and coarse Sentinel-1 built-change corroboration at the recorded coordinate.
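The province-level confidence figure is a simple ratio of allocations. Here is a sketch under assumptions: the record fields (province, allocation, has_coordinate) and the toy rows are hypothetical, not the BetterGovPH schema or real project data.

```python
from collections import defaultdict
from typing import Dict, Iterable, Mapping


def fallback_share_by_province(projects: Iterable[Mapping]) -> Dict[str, float]:
    """Share of each province's allocation located only by province text."""
    total: Dict[str, float] = defaultdict(float)
    fallback: Dict[str, float] = defaultdict(float)
    for project in projects:
        province = project["province"]
        total[province] += project["allocation"]
        if not project["has_coordinate"]:
            fallback[province] += project["allocation"]
    return {
        province: fallback[province] / amount
        for province, amount in total.items()
        if amount > 0
    }


# Toy rows with placeholder province names and amounts.
projects = [
    {"province": "Province A", "allocation": 100.0, "has_coordinate": True},
    {"province": "Province A", "allocation": 300.0, "has_coordinate": False},
    {"province": "Province B", "allocation": 200.0, "has_coordinate": True},
]
print(fallback_share_by_province(projects))
```

A high share means most of that province's allocation could not be placed on the map at all, so any gap reported there rests on province text rather than on a coordinate.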

A reported gap is not a finding of project failure. The recorded coordinate itself may be wrong. The published surface is province-level. Named-project detail is available only by direct id lookup. Every analytics surface carries the public-records disclaimer.

What it is not

Not a real-time warning system and not a forecast. Sentinel-1 has a roughly 6-day revisit (Sentinel-1A plus Sentinel-1C, restored around May 2025) with about 24 hours of product latency. Every observed-pass layer and slider frame is a real past acquisition labeled with its as-of date. The rainfall layer (GPM IMERG over the modeled-prone areas) is dated context, not a prediction. The /lookup returns dated evidence and never says an area will or will not flood. For warnings and live conditions, use PAGASA, MMDA Flood Control, and your LGU DRRM office.

Not a substitute for the official hazard maps. UP NOAH, Project NOAH, PAGASA, MGB, and Google Flood Hub are the authoritative instruments. floodwatch-ph is an independent, reproducible observation of where water actually was, and a model of where it recurs. It is complementary, never a replacement.

Not address-level. Exposure is aggregated to province. No household geometry. No per-dwelling flood status. No PII.

All data sourced from public satellite archives and public records. floodwatch-ph computes statistical and observational indicators only. Specific allegations, if any, require independent investigation and corroboration.

Repo: github.com/xmpuspus/floodwatch-ph. DOI: 10.5281/zenodo.20218731.