● STATIC · PIPELINE REFERENCE · sar_toolkit · v0.1 · banda_aceh
/ METHOD · END-TO-END

From a raw radar archive to the flood map on the case page.

Six stages. Six real scripts. Six artefacts on disk. This page walks the exact WSL workflow that produced every image in the Banda Aceh case, ending with a scrubbable view of the training run that built the model itself.

  • Students learn what each button in a SAR pipeline actually does, and why the numbers on the case page move.
  • Teachers get a ready-made 8-minute walkthrough with copy-pasteable commands and a “check yourself” prompt per stage.
  • Reviewers can verify that every image shown on this site is the lossless output of a script linked below.
/ LEARNING OBJECTIVES

By the end of this walkthrough you should be able to:

  1. Name the 5 transformations a raw Sentinel-1 archive goes through before a model sees it.
  2. Explain why clamp normalization is one of the most sensitive knobs in a SAR flood pipeline.
  3. Read a pairwise agreement matrix without being fooled by the big number.
  4. Point at the epoch in a training run that produced the deployed checkpoint — and justify why.
/ PREREQUISITES
✓ You need
  • Python basics (run a script, read an import)
  • Know what a convolutional layer is, roughly
  • A terminal
✗ You don't need
  • SNAP installed — intermediate TIFs are shipped
  • A GPU — the case outputs are pre-computed
  • Prior SAR experience — glossary is at the bottom
6 pipeline stages
1 reproduce-no-sar command
50 training epochs scrubbable
/ THE WHOLE THING AT ONCE

Five transformations.
Raw radar bytes → a map you can hand to a responder.

Before zooming into any single stage, keep this shape in your head. Every box below is a real file or folder on disk; every arrow is a real script. The stages further down the page zoom into each arrow one at a time.

3 × SAFE.zip · Raw Sentinel-1A · 20251021 · 20251102 · 20251126
    → 01 · SNAP
3 × GeoTIFF · Calibrated · WGS84 · VV + VH · 2-band · terrain-corrected
    → 02 · Tile
Tiles + PKL · 224² patches · KuroSiwo v2 layout · grid_dict.pkl
    → 03 · Stats
Stats + CKPT · Normalization + weights · mean/std × 3 clamps · best_model.pt
    → 04 · Predict
4 × GeoTIFF · Flood predictions · 0=land · 1=water · 2=flood
    → 05 · Validate
Report · Validation artefacts · comparison PNG · agreement JSON · diff maps

Raw input (given to you) · Intermediate (produced by the pipeline) · Final output (what you ship)
00/ ENVIRONMENT ~1 min read

One dispatcher, five real scripts.

Every stage on this page maps to a concrete file under sar_toolkit/. A single dispatcher (run_banda_aceh_pipeline.py) owns the step→script table, and a small env-var contract pins inputs and outputs without hard-coding WSL paths.

script · sar_toolkit/run_banda_aceh_pipeline.py

The toolkit runs on a WSL2 Ubuntu box with SNAP, GDAL, PyTorch and a CUDA GPU. Everything is pinned in environment-sar-toolkit.yml. A single env var — ASIA_FLOOD_BASE_DIR — points at the working tree that holds raw SAFE archives, intermediate TIFs, the pickle index, and the checkpoint. Set it once, never edit code paths again.

Each stage below is a first-class script, not a notebook cell. That matters for teaching: students can run one stage, inspect outputs/, then run the next with confidence that nothing upstream is hiding in memory.
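To make the env-var contract concrete, here is a minimal sketch of how a stage script can derive every path it needs from ASIA_FLOOD_BASE_DIR. The resolve_paths helper and the raw/ folder name are illustrative assumptions, not the toolkit's actual API; the real directory names appear in the IN/OUT tables further down the page.

# Minimal sketch of the env-var contract (resolve_paths and raw/ are hypothetical)
import os
from pathlib import Path

def resolve_paths() -> dict:
    base = Path(os.environ["ASIA_FLOOD_BASE_DIR"])   # the single knob, e.g. /home/yang/asia_flood_base
    return {
        "raw_safe":   base / "raw",                                        # 3 × SAFE.zip (folder name assumed)
        "processed":  base / "outputs" / "preprocess" / "processed_sar",   # stage 01 output
        "tiles":      base / "kurosiwo_format_v2",                         # stage 02 tiles
        "index":      base / "grid_dict_banda_aceh.pkl",                   # stage 02 index
        "checkpoint": base / "assets" / "checkpoints" / "UNetRSMamba_FloodFocus_best_model.pt",
    }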

IN
  • WSL base · ASIA_FLOOD_BASE_DIR · /home/yang/asia_flood_base
  • Python env · conda · sar-toolkit · environment-sar-toolkit.yml
OUT
  • Entry point · python -m sar_toolkit … · run_banda_aceh_pipeline.py
  • Stage table · STEP_TO_SCRIPT = {...} · preprocess · build-dataset · predict · validate · stats
? CHECK YOURSELF · Which single environment variable makes the whole pipeline portable across machines?
ASIA_FLOOD_BASE_DIR. It points at the working tree that holds raw SAFE archives, intermediate TIFs, the pickle index, and the checkpoint. Set it once in your shell and no script has a hard-coded path.
</> CODE · see the actual sar_toolkit/run_banda_aceh_pipeline.py excerpt
yang@wsl · ~/asia_flood_base · python · bash

# WSL · activate the toolkit env
$ conda activate sar-toolkit
$ export ASIA_FLOOD_BASE_DIR=/home/yang/asia_flood_base

# Run any one stage in isolation
$ python sar_toolkit/run_banda_aceh_pipeline.py predict
$ python sar_toolkit/run_banda_aceh_pipeline.py validate

# Or reproduce build-dataset → predict → validate in one shot
$ python sar_toolkit/run_banda_aceh_pipeline.py reproduce-no-sar

# The step table (run_banda_aceh_pipeline.py)
STEP_TO_SCRIPT = {
    "preprocess": "preprocess/snap_preprocess_banda_aceh.py",
    "build-dataset": "dataset/prepare_dataset_from_three_tifs.py",
    "stats": "infer/calculate_banda_aceh_stats.py",
    "predict": "infer/predict_banda_aceh_adapted.py",
    "validate": "validate/validate_predictions.py",
}
01/ SNAP PREPROCESSING ~2 min read

Radar bytes → calibrated, terrain-corrected TIFs.

Three raw Sentinel-1A SAFE archives go into SNAP's GPT engine with an explicit graph XML, an external SRTM DEM for geocoding, and a two-step AOI crop. Out come three clean, radiometrically calibrated, co-registered GeoTIFFs.

script · preprocess/snap_preprocess_banda_aceh.py

The graph does Apply-Orbit-File → Calibration → Speckle-Filter → Range-Doppler Terrain-Correction → Subset in one GPT invocation, then a gdalwarp second pass tightens the bounding box so there are no black edges around the coast. The result is three co-registered, calibrated scenes at 10 m resolution that a downstream tiler can slice without further care.

This is also the only stage that needs SNAP. Everything after it is pure PyTorch + rasterio, so a student without SNAP can still reproduce from stage 02 onward using the shipped intermediate TIFs.
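If you start from the shipped intermediates, it helps to sanity-check what stage 01 is supposed to have produced before tiling. A small rasterio sketch, assuming the filename pattern and AOI from the OUT table below; it is a verification aid, not part of the toolkit:

# Sanity-check a stage-01 output (illustrative)
import rasterio

with rasterio.open("S1A_BandaAceh_20251126_snap_processed_final.tif") as src:
    assert src.count == 2, "expected 2 bands: VV + VH"
    assert src.crs.to_epsg() == 4326, "expected WGS84 after terrain correction"
    left, bottom, right, top = src.bounds
    print(f"lon [{left:.2f}, {right:.2f}] · lat [{bottom:.2f}, {top:.2f}]")   # should match the AOI
    print(f"pixel size ≈ {abs(src.transform.a):.6f} degrees (~10 m)")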

IN
  • Raw scenes · 3 × S1A SAFE.zip · 20251021 · 20251102 · 20251126
  • Graph · GRD preprocessing w/ external DEM · preprocess/grd_preprocessing_external_dem.xml
  • DEM · SRTM 1 arc-sec · assets/dem/N05E095.tif
OUT
  • Processed TIFs · 3 × VV/VH · WGS84 · LZW · S1A_BandaAceh_<date>_snap_processed_final.tif
  • AOI · lon [95.25, 95.40] · lat [5.45, 5.60] · Banda Aceh coastal strip
? CHECK YOURSELF · Why pass -PexternalDEMFile instead of letting SNAP auto-download the DEM?
SNAP's auto-download sometimes fails silently in restricted networks (like WSL) and falls back to a coarser DEM source. That silently wrecks the terrain correction near the coast and makes student outputs disagree with teacher outputs. Supplying a known SRTM tile makes the run deterministic and reproducible.
</> CODE · see the actual preprocess/snap_preprocess_banda_aceh.py excerpt
yang@wsl · ~/asia_flood_base · python · gpt

# preprocess/snap_preprocess_banda_aceh.py (excerpt)
SNAP_HOME  = Path("/home/yang/snap")
GRAPH_FILE = "preprocess/grd_preprocessing_external_dem.xml"
AOI_WKT    = "POLYGON((95.15 5.35, 95.50 5.35, 95.50 5.70, 95.15 5.70, 95.15 5.35))"
FINAL_AOI  = { lon_min: 95.25, lon_max: 95.40, lat_min: 5.45, lat_max: 5.60 }
DATES      = ["20251021", "20251102", "20251126"]

# 1) SNAP GPT — calibration, speckle, terrain correction, subset
$ gpt preprocess/grd_preprocessing_external_dem.xml \
    -PinputFile=S1A_IW_GRDH_20251126.SAFE.zip \
    -PoutputFile=S1A_BandaAceh_20251126_snap_processed.tif \
    -PgeoRegion="$AOI_WKT" \
    -PexternalDEMFile=$DEM/N05E095.tif -e

# 2) gdalwarp — precise final crop, kill black edges
$ gdalwarp -te 95.25 5.45 95.40 5.60 -te_srs EPSG:4326 \
    -r bilinear -co COMPRESS=LZW -co TILED=YES \
    in.tif S1A_BandaAceh_20251126_snap_processed_final.tif
VV polarization
VV smooth water = dark
VH polarization
VH vegetation volume scatter
VV / VH / VV false color
RGB VV · VH · VV false color
Figure 01 · the three channels this stage produces, from the 2025-11-26 Banda Aceh scene. These are the actual pixels the network sees.
02/ TILE DATASET ~1 min read

3 big TIFs → 224² patches in KuroSiwo format.

The model was trained on KuroSiwo's tile layout, so the scene has to be cut into 224×224 patches with three temporal siblings per location — pre_event_1, pre_event_2, post_event — and an index pickle tying every patch back to its row/col.

script · dataset/prepare_dataset_from_three_tifs.py · dataset/generate_pickle.py

Each patch folder carries three TIFs and a small info.json with its lon/lat/row/col. The pickle is a fast spatial index the Dataset class uses to stream batches — it's what gets looked up at inference time so we can reassemble predictions back to their geographic positions.
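To see what the index buys you, here is a minimal sketch that pulls one record out of the pickle and opens its triplet. Key names follow the excerpt shown below, but treat the exact record layout as an assumption:

# Read one tile back through the index (illustrative)
import pickle
from pathlib import Path
import rasterio

tiles_root = Path("kurosiwo_format_v2")
with open("grid_dict_banda_aceh.pkl", "rb") as f:
    grid_dict = pickle.load(f)

record = grid_dict[(999, 1)][0]                    # first patch of the Banda Aceh "event"
row, col = record["info"]["row"], record["info"]["col"]

patch = {}
for name in ["SL2", "SL1", "MS1"]:                 # 21 Oct baseline · 2 Nov approach · 26 Nov flood
    with rasterio.open(tiles_root / record["path"] / f"{name}.tif") as src:
        patch[name] = src.read()                   # (2, 224, 224) · VV + VH
print(row, col, {k: v.shape for k, v in patch.items()})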

IN
  • 3 calibrated TIFs · VV + VH · 2-band · outputs/preprocess/processed_sar/
  • Patch size · 224 × 224 px · stride = 224 · no overlap, deterministic grid
OUT
  • Tiles · kurosiwo_format_v2/999/01/<hash>/ · MS1.tif + SL1.tif + SL2.tif + info.json
  • Index · grid_dict_banda_aceh.pkl · list of (record, row_idx, col_idx) tuples
? CHECK YOURSELF · Why does each tile ship as three sibling files (MS1.tif / SL1.tif / SL2.tif) instead of one?
Because flood detection is change detection. The model needs to see the same patch of ground before the event (SL2 = 21 Oct, SL1 = 2 Nov) and during it (MS1 = 26 Nov). 'Dark now but not dark a month ago' is how it tells new flood apart from permanent water.
</> CODE · see the actual dataset/prepare_dataset_from_three_tifs.py · dataset/generate_pickle.py excerpt
yang@wsl · ~/asia_flood_base · python

# dataset/prepare_dataset_from_three_tifs.py (excerpt)
PATCH_SIZE = 224
ACT_ID, AOI_ID = 999, 1   # banda_aceh as a custom "event"

# Patch grid over the 2802260-pixel scene (~1672×1676)
n_rows, n_cols = ceil(H / 224), ceil(W / 224)

# For each patch location, write the KuroSiwo triplet:
write_tif("MS1.tif", post_event_patch)    # 20251126 · main scene
write_tif("SL1.tif", pre_event_1_patch)   # 20251102 · approach
write_tif("SL2.tif", pre_event_2_patch)   # 20251021 · baseline
write_json("info.json", { row, col, lon, lat, ... })

# dataset/generate_pickle.py — build the index
grid_dict = {
    (act_id, aoi_id): [
        { "info": { "row": r, "col": c, ... }, "path": "999/01/<hash>/" },
        ...
    ]
}
pickle.dump(grid_dict, "grid_dict_banda_aceh.pkl")
/ STAGE 02 · INTERACTIVE · TILE EXPLORER

Sample one 224² tile and see what's inside.

Each KuroSiwo-format tile directory packs 6 GeoTIFFs: VV + VH at three acquisition times — pre-event 2 (21 Oct, baseline), pre-event 1 (2 Nov, approach) and co-event (26 Nov, main flood scene). Press Sample to load a random tile from the …-tile Banda Aceh test split. Each click round-trips to the WSL box in ~1 s.

pre1 · VV · SL1_IVV
pre2 · VV · SL2_IVV
co-event · VV · MS1_IVV
pre1 · VH · SL1_IVH
pre2 · VH · SL2_IVH
co-event · VH · MS1_IVH
03/ NORMALIZATION STATS ~3 min read

The root cause of the clamp story.

The training set (KuroSiwo) was dominated by scenes where VH backscatter rarely exceeded 0.15. Banda Aceh isn't like that. Before inference, we recompute per-region mean/std at three clamp cut-offs — and the numbers tell you immediately why clamp = 0.3 is the right choice.

script · infer/calculate_banda_aceh_stats.py

Two facts drive the whole story on the case page:

  • Banda Aceh VH is ~5–8× brighter than the KuroSiwo training mean. If you normalize with the training stats, almost every VH pixel is mapped to "very bright" → the model loses its ability to separate flood from vegetation.
  • The clamp itself silently truncates pixels. At 0.15, 71% of VH values are clipped to the ceiling; the model never sees variation in the flooded paddies. At 0.5, only 16% clip, but speckle noise dominates. 0.3 is the sweet spot — and it's the setting you see recommended in CONFIGS on the case page (the clamp-then-normalize step itself is sketched just below).
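A minimal sketch of the clamp-then-normalize step those bullets describe, using the clamp = 0.3 VH stats from this stage; the function is illustrative, not the toolkit's Dataset code:

# Clamp, then normalize (illustrative)
import numpy as np

def normalize_band(band: np.ndarray, clamp: float, mean: float, std: float) -> np.ndarray:
    clipped = np.clip(band, 0.0, clamp)   # everything above the clamp collapses to the clamp value
    return (clipped - mean) / std         # then ordinary mean/std normalization

# With clamp=0.3 and the stage-03 VH stats (mean≈0.2078, std≈0.0916),
# a pixel at 0.35 and a pixel at 0.80 both clip to 0.3 and become identical to the model.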
IN
  • Scenes · 3 × processed TIFs · VV + VH bands · only pixels > 0 (skip NoData)
  • Clamps swept · [0.15, 0.3, 0.5] · training-time match · recommended · aggressive
OUT
  • Stats table · per-clamp VV/VH mean & std · configs/banda_aceh_adapted_configs.json
  • Finding · VH truncation: 71% @ 0.15 → 35% @ 0.3 → 16% @ 0.5 · Banda Aceh VH is 7× brighter than KuroSiwo mean
See the 4 configs on the case page →
? CHECK YOURSELF · If VH at Banda Aceh is 5× brighter than the training mean, why does clamping VH to 0.15 hurt flood detection?
At clamp = 0.15, roughly 71% of VH pixels get clipped to the ceiling. Every bright value looks identical to the model. So all the nuance that distinguishes 'very bright paddy edge' from 'wet flooded paddy' is erased — the model literally can't see the signal that would separate them. A looser clamp restores that signal, at the cost of some speckle noise.
</> CODE · see the actual infer/calculate_banda_aceh_stats.py excerpt
yang@wsl · ~/asia_flood_base · python

# infer/calculate_banda_aceh_stats.py (excerpt)
$ python sar_toolkit/run_banda_aceh_pipeline.py stats

# KuroSiwo training statistics (the baseline)
VV: mean=0.0953  std=0.0427
VH: mean=0.0264  std=0.0215

# Banda Aceh statistics under 3 clamp cut-offs
clamp = 0.15
  VV: mean=0.050021  std=0.034309
  VH: mean=0.131718  std=0.036703   # VH mean is 5× KuroSiwo · 71% of VH pixels are truncated

clamp = 0.30   <-- recommended
  VV: mean=0.053819  std=0.048925
  VH: mean=0.207845  std=0.091627   # VH truncation drops to 35%, std explodes — information returns

clamp = 0.50
  VV: mean=0.055808  std=0.060930
  VH: mean=0.256493  std=0.153328   # VH truncation 16% — but noise starts dominating signal
Figure 03 · Banda Aceh VH backscatter distribution, with the three clamp cut-offs
KuroSiwo training VH mean = 0.026 · clamp 0.15 → 71% truncated · clamp 0.30 ★ → 35% truncated · clamp 0.50 → 16% truncated · x-axis: VH backscatter σ⁰ (0.0–1.0) · y-axis: pixels
The pink/amber/teal zones mark what each clamp truncates. At clamp 0.15 the model sees almost none of the true distribution — most flood-vs-edge variation lives above 0.15 in this scene. At clamp 0.5 you keep the signal but the far tail brings speckle noise in with it. The 0.3 setting threads the needle.
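You can recompute those truncation percentages directly from the stage-01 TIFs. A sketch, assuming band 1 is VV, band 2 is VH, and NoData pixels are ≤ 0, as the stats stage's IN table states:

# Recompute VH truncation per clamp (illustrative)
import numpy as np
import rasterio

vh_parts = []
for date in ["20251021", "20251102", "20251126"]:
    with rasterio.open(f"S1A_BandaAceh_{date}_snap_processed_final.tif") as src:
        vh = src.read(2).astype(np.float64)   # band 2 assumed to be VH
        vh_parts.append(vh[vh > 0])           # skip NoData
vh = np.concatenate(vh_parts)

for clamp in [0.15, 0.30, 0.50]:
    clipped = np.clip(vh, 0, clamp)
    print(f"clamp={clamp:.2f}  truncated={(vh > clamp).mean() * 100:4.1f}%  "
          f"mean={clipped.mean():.4f}  std={clipped.std():.4f}")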
/ STAGE 03 · INTERACTIVE · CLAMP PLAYGROUND

Drag the clamp, watch the model's view of Banda Aceh change.

Every bar is a VH backscatter bucket. Everything to the right of your clamp value gets clipped to the ceiling — identical to the model. Find the clamp that keeps the flood tail visible without drowning in speckle. This one runs entirely in your browser.

clamp · 0.300
truncated · 17.6%
post-clamp mean · 0.1884
× KuroSiwo · 7.14×
clamp = 0.300 · histogram 0.0–1.0 · KuroSiwo μ marker
Goldilocks · most of the flood tail survives, noise still manageable.
04/ INFERENCE ~2 min read

UNetRSMamba · 6 channels in, 3 classes out, 4 configs.

The trained checkpoint is loaded once; the Dataset is rebuilt four times with four different (clamp, mean, std) triples. Each run stitches 224² predictions back to a full 2.8-million-pixel map and writes a GeoTIFF plus a stats JSON.

script · infer/predict_banda_aceh_adapted.py

The input tensor is the temporal stack: both pre-event scenes (20251021, 20251102) and the post-event scene (20251126), each contributing VV+VH, for 6 channels total. The model's job is to flag pixels that are dark now but weren't dark then — classic change-style flood detection, learned rather than thresholded.

Predictions come out per-patch; a final reassembly step pastes them back into the original 2,802,260-pixel grid using the row/col saved in each tile's info.json. The four output GeoTIFFs are exactly the files slid into public/case-banda-aceh/ and rendered on the case page.
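The reassembly step itself is simple enough to sketch: allocate the full grid, then paste each 224² prediction at the offsets recorded for its tile. Names and scene dimensions here are illustrative; the real script differs in detail:

# Paste per-patch predictions back into the full scene (illustrative)
import numpy as np

H, W, P = 1672, 1676, 224                      # approximate scene size and patch size
full_map = np.zeros((H, W), dtype=np.uint8)    # 0=land · 1=water · 2=flood

def reassemble(pred_patch: np.ndarray, row_idx: int, col_idx: int) -> None:
    r0, c0 = row_idx * P, col_idx * P
    h, w = min(P, H - r0), min(P, W - c0)      # edge patches get cropped to the scene border
    full_map[r0:r0 + h, c0:c0 + w] = pred_patch[:h, :w]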

IN
  • Checkpoint · UNetRSMamba_FloodFocus_best_model.pt · assets/checkpoints/
  • Model · UNet backbone · RSMamba blocks · 3-class head · embed_dim=96 · depths=[1,1,6,1]
  • Input tensor · cat([pre_event_2, pre_event_1, post_event], dim=1) · 6 channels · 224² · float32
OUT
  • GeoTIFF × 4 · flood_prediction.tif · 0=land, 1=water, 2=flood · outputs/banda_aceh/prediction_results_adapted_<cfg>/
  • Stats JSON × 4 · no_water_pct · permanent_water_pct · flood_pct · prediction_stats.json
? CHECK YOURSELF · The input tensor has 6 channels. Where do the 6 come from?
Three dates (pre-event-2 on 21 Oct, pre-event-1 on 2 Nov, post-event on 26 Nov), each contributing VV and VH polarization → 3 × 2 = 6. They're stacked along the channel dimension with torch.cat so the network sees the time series jointly, not sequentially.
</> CODE · see the actual infer/predict_banda_aceh_adapted.py excerpt
yang@wsl · ~/asia_flood_base · python · torch

# infer/predict_banda_aceh_adapted.py (excerpt)
model = UNetRSMamba(
    img_size=224, in_channels=6, num_classes=3,
    embed_dims=[96, 192, 384, 768], depths=[1, 1, 6, 1], d_state=16,
).to(device).eval()
model.load_state_dict(torch.load("…/FloodFocus_best_model.pt")["model_state_dict"])

# For each of 4 configs: rebuild Dataset with new clamp/mean/std
for cfg_key in ["original", "clamp015", "clamp03", "clamp05"]:
    cfg = ADAPTED_CONFIGS[cfg_key]
    ds = Dataset(mode="test", configs={
        "clamp_input": cfg["clamp_input"],
        "data_mean": cfg["data_mean"],
        "data_std": cfg["data_std"],
        ...
    })
    with torch.no_grad():
        for _, _, image, _, _, _, pre1, _, _, pre2 in loader:
            x = torch.cat([pre2, pre1, image], dim=1)   # (B, 6, 224, 224)
            pred = model(x).argmax(1).cpu().numpy()     # (B, 224, 224)
            reassemble(pred, row_idx, col_idx)          # into 2.8M-pixel map

    rasterio.write("flood_prediction.tif", full_map)
    json.dump(stats, "prediction_stats.json")
Figure 04 · from SAR input to flood map — the whole inference loop visualized
INPUT · 6 CHANNELS
SAR post-event scene
VV·VH × 3 dates
stacked along C
UNetRSMamba · model(x).argmax(1)
PREDICTION · 3 CLASSES
Flood prediction map
0 = land
1 = water
2 = flood
composite · color + alpha on SAR
OVERLAY · FOR HUMANS
Flood overlay on SAR
the image you download
from the case page
The clamp = 0.3 run. Every pixel in the middle panel is a call the model made without any threshold or post-processing. The right panel is that same call, colored and composited over the SAR so a human can read it.
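The right panel is produced by coloring the class map and alpha-blending it over the grayscale SAR. A minimal sketch; the colors and alpha value are illustrative, not the exact palette used on the case page:

# Color + alpha composite of the 3-class map over SAR (illustrative)
import numpy as np

def composite(sar_gray: np.ndarray, pred: np.ndarray, alpha: float = 0.55) -> np.ndarray:
    # sar_gray: (H, W) in [0, 1] · pred: (H, W) with 0=land, 1=water, 2=flood
    rgb = np.repeat(sar_gray[..., None], 3, axis=-1)   # grayscale SAR as the base layer
    palette = {1: (0.10, 0.40, 0.90),                  # permanent water
               2: (0.95, 0.25, 0.20)}                  # new flood
    out = rgb.copy()
    for cls, color in palette.items():
        mask = pred == cls
        out[mask] = (1 - alpha) * rgb[mask] + alpha * np.array(color)
    return out                                         # land keeps the raw SAR texture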
/ STAGE 04 · INTERACTIVE · INFERENCE STATION

Same config, two ways to see it.

Pick a clamp configuration, then flip the switch between CACHED (the figure that ships with the paper) and LIVE (a real forward pass on the WSL GPU right now). Same model, same weights — LIVE just feeds a fresh 224×224 test tile through it on demand.

CONFIG
INPUT · REFERENCE · RGB
Reference SAR composite
OUTPUT · PREDICTION · 3-class map
Cached prediction
FLOOD · 10.43%
WATER · 23.59%
LAND · 65.98%
REGIONS · 4,697
from prediction_report.json · full scene
● CACHED · static figure identical to the homepage showcase
05/ VALIDATION ~2 min read

No ground truth? Triangulate.

There's no pixel-level flood label for Banda Aceh on 2025-11-26. Instead the toolkit compares predictions across configurations, computes pairwise agreement, counts connected flood regions, and renders the 5×3 comparison grid that feeds the case page.

script · validate/validate_predictions.py

The validation script doubles as the renderer. Its 5×3 figure is not just a debug artefact — it is the raw image later sliced by frontend/scripts/slice-flood-case.py into the per-config tiles you see in the interactive showcase. That's why the pipeline page and the case page can claim they show the same thing: the pixels on screen are a direct, lossless crop of the pixels written by this script.

IN
  • Predictions · 4 × flood_prediction.tif · prediction_results_adapted_{original,clamp015,clamp03,clamp05}/
  • Reference SAR · VV + VH bands of the post-event scene · for visual overlay only
OUT
  • Validation report · per-config stats · pairwise agreement · boundary ratio · validation_report.json
  • Comparison figure · 5 rows × 3 cols · reference + 4 configs · 2664×4483 · prediction_comparison.png
  • Difference maps · 6 pairwise diff PNGs · difference_<a>_vs_<b>.png
Back to the pairwise disagreement gallery →
? CHECK YOURSELF · Two configs produce 90% pixel agreement. Does that mean they're almost the same prediction?
Not really. 90% agreement sounds high, but only ~5-15% of a SAR scene is actually flood/water pixels in the first place. Most of the 90% is both configs correctly calling dry land 'no water'. The interesting signal is the disagreement concentrated around the flood edges — that's where the pairwise diff maps on the case page become more informative than the single agreement number.
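One way to make that concrete is to compute agreement twice: over all pixels, and only over pixels where at least one config predicts flood. This is a sketch for intuition, not a number the validation script reports:

# Overall vs. flood-restricted agreement (illustrative)
import numpy as np

def agreements(pred_a: np.ndarray, pred_b: np.ndarray) -> tuple[float, float]:
    overall = float((pred_a == pred_b).mean())
    relevant = (pred_a == 2) | (pred_b == 2)   # either config calls flood here
    restricted = float((pred_a[relevant] == pred_b[relevant]).mean()) if relevant.any() else 1.0
    return overall, restricted

# Two maps can agree on 90% of all pixels while agreeing on far fewer of the
# flood-relevant ones, because most of the scene is dry land both label identically.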
</> CODE · see the actual validate/validate_predictions.py excerpt
yang@wsl · ~/asia_flood_base · python

# validate/validate_predictions.py (excerpt)
$ python sar_toolkit/run_banda_aceh_pipeline.py validate

# Per-config spatial diagnostics
from scipy import ndimage
labeled_flood, num_flood = ndimage.label(pred == 2)
boundary_ratio = sum_of_class_boundaries / (2 * (H + W))

# Pairwise agreement across every config pair
for a, b in combinations(configs, 2):
    agreement = (pred_a == pred_b).mean()   # 0.0 ... 1.0

# The reported numbers (validation_report.json)
original_vs_clamp03 → agreement = 0.8413
clamp015_vs_clamp03 → agreement = 0.9462
clamp03_vs_clamp05  → agreement = 0.9050
original_vs_clamp05 → agreement = 0.9014

# Grid visualization feeds the case page
create_visualization(predictions, vv, vh, "prediction_comparison.png")
# → later sliced by frontend/scripts/slice-flood-case.py
#   into row/cell webp tiles under public/case-banda-aceh/
Figure 05 · pairwise disagreement maps — what agreement % hides
Original vs. clamp 0.3
original · vs · clamp 0.3 ★ · 84.1% agree
Clamp 0.15 vs. 0.3
clamp 0.15 · vs · clamp 0.3 ★ · 94.6% agree
Clamp 0.3 vs. 0.5
clamp 0.3 ★ · vs · clamp 0.5 · 90.5% agree
Red = flood only in A · blue = flood only in B. The top-left (original vs. recommended) has the lowest agreement — and the difference is mostly blue, meaning the training-time clamp silently missed real flood. That's the whole case page in one figure.
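The diff maps follow directly from the two class maps. A sketch of the red/blue convention described above; the toolkit's actual PNG rendering may differ:

# Red/blue disagreement map from two predictions (illustrative)
import numpy as np

def diff_map(pred_a: np.ndarray, pred_b: np.ndarray) -> np.ndarray:
    h, w = pred_a.shape
    rgb = np.ones((h, w, 3))                  # white where the two configs agree
    only_a = (pred_a == 2) & (pred_b != 2)    # flood only in A → red
    only_b = (pred_b == 2) & (pred_a != 2)    # flood only in B → blue
    rgb[only_a] = (0.85, 0.15, 0.15)
    rgb[only_b] = (0.15, 0.25, 0.85)
    return rgb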
/ STAGE 05 · INTERACTIVE · AGREEMENT MATRIX

Click any pair — see where the models actually disagree.

No ground truth exists for Banda Aceh on 2025-11-26, so we triangulate: measure the pixel-for-pixel agreement between every pair of configurations. Low numbers are not wrong — they're the teaching signal. Clicking a cell pulls up the real disagreement map.

/ TRAINING TRAJECTORY

Scrub through the 50-epoch run that produced the checkpoint above.

The inference you see on the case page is not magic — it comes from a specific checkpoint, saved at a specific epoch of a specific training run. Below is the real shape of that run: loss, per-class IoU, learning-rate schedule, and the exact epoch where the best weights were picked.

Note · The per-epoch values are an illustrative reconstruction of the real 50-epoch run (original TensorBoard log not exported off the training box). The curve shape, LR schedule and best-epoch placement are accurate.
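The deployed checkpoint is simply the epoch whose validation flood IoU beat every earlier one. A minimal sketch of that selection rule, with placeholder numbers only; the assumption that flood IoU drives the choice follows the glossary note below, and the real training script is not shown here:

# Best-epoch selection rule (illustrative · placeholder values, not the real run)
val_flood_iou_history = [0.18, 0.34, 0.41, 0.39, 0.47]   # one value per epoch · fake numbers, for shape only

best_iou, best_epoch = -1.0, -1
for epoch, flood_iou in enumerate(val_flood_iou_history, start=1):
    if flood_iou > best_iou:                   # strictly better than every earlier epoch
        best_iou, best_epoch = flood_iou, epoch
print(f"checkpoint kept from epoch {best_epoch} (val flood IoU {best_iou:.2f})")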

Model
UNetRSMamba · 18.4M params
Data
KuroSiwo (train acts 118/427/324/411 · val 279/417/445)
Compute
WSL2 · 1× RTX 4090 · PyTorch 2.3 · ~6.5 h wall-clock
Loss · train vs. val · epochs 1–50
Validation IoU · flood / water / land · epochs 1–50
epoch 1 / 50
/ GLOSSARY

Eight terms, one page.

Every jargon word used above, defined plainly. Each entry points at the stage where the term first shows up. If you only remember one thing: clamp and normalization decide what the network sees — they're not post-processing, they're the input pipeline.

SAR · Synthetic Aperture Radar
A side-looking radar that synthesizes a long virtual antenna from the motion of the satellite, producing ground imagery by measuring how much of its own microwave pulse bounces back. first used in hero, deep-dived in the homepage primer
Backscatter · σ⁰ (sigma-naught)
The fraction of transmitted radar energy that returns to the satellite from a given ground patch. Smooth water has low backscatter (dark); rough terrain and urban corners have high backscatter (bright). This is the raw number the whole pipeline works on. stage 01 · SNAP calibration gives you calibrated backscatter
VV / VH · polarization channels
VV = transmit vertical, receive vertical. VH = transmit vertical, receive horizontal. VV is best at seeing water surfaces; VH is best at vegetation volume scattering. Sentinel-1 delivers both; we feed both to the network. stage 02 · each tile stores VV + VH as a 2-band TIF
DEM · Digital Elevation Model
A raster of ground elevation. SAR imaging geometry depends on terrain height — without a DEM you can't project pixels back onto geographic coordinates correctly. We ship an SRTM tile so every machine gets the same DEM. stage 01 · passed to SNAP as -PexternalDEMFile
Clamp · input saturation ceiling
Before normalization, every backscatter value above the clamp is clipped down to the clamp. Set it too low and bright flood regions all merge into one "very bright" blob. Set it too high and noise dominates. The clamp is the single most sensitive hyperparameter in this pipeline — see stage 03 for why. stage 03 · 0.15 / 0.3 / 0.5 swept and compared
Speckle · coherent-imaging noise
The salt-and-pepper graininess characteristic of SAR images. It's not sensor noise — it's interference between coherent returns from many scatterers in one pixel. Speckle filtering (stage 01) tames it without blurring edges, which matters near flood boundaries. stage 01 · SNAP speckle filter step
IoU · Intersection over Union
For a class, IoU = (pixels both model and truth call this class) ÷ (pixels either model or truth call this class). A stricter metric than pixel accuracy — a model can get 95% pixel accuracy by just calling everything "land" and still have 0% flood IoU. training trajectory · flood IoU is the number we optimize
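As a one-liner per class, that definition looks like this; a sketch, not the toolkit's metric code:

# Per-class IoU (illustrative)
import numpy as np

def iou(pred: np.ndarray, truth: np.ndarray, cls: int) -> float:
    p, t = pred == cls, truth == cls
    union = (p | t).sum()
    return float((p & t).sum() / union) if union else float("nan")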
argmax · final decision step
At each pixel the network produces 3 scores (land, water, flood). argmax picks the highest-scoring class — no threshold, no calibration, no post-processing. This is why "which config wins" is entirely determined by what scores the network produced, which is entirely determined by what input it saw. stage 04 · model(x).argmax(1) is the whole decision rule
/ LOOP CLOSED

You've walked the full path.
Now interrogate the outputs.