Plant Root Segmentation Pipeline

Instance segmentation of plant root cross-sections from fluorescence microscopy. Supports 4 species (Millet, Rice, Sorghum, Tomato), 3 microscopes (C10, Olympus, Zeiss), and 5 target classes (Whole Root, Aerenchyma, Endodermis, Vascular, Exodermis). Cereals (monocots) have aerenchyma; tomato (dicot) has exodermis instead. Models can train with 4 classes (standard) or 5 classes (with exodermis), using masked loss for missing annotations per species.

Project Structure

plants/
├── predict.py               # Inference + save predictions
├── evaluate.py              # Evaluation with GT comparison
├── analyze_downstream.py    # Downstream biological analysis
├── polygon_editor.py        # Manual annotation review/correction
├── preview_annotations.py   # Generate annotation preview images
├── visualize_augmentations.py
├── train/                   # Training scripts
│   ├── train_yolo.py
│   ├── train_unet.py
│   ├── train_sam.py
│   ├── train_cellpose.py
│   ├── run_training.py      # Sequential training runner
│   └── run_grid_training.py # Benchmark grid runner (runs 1-5)
├── src/                     # Shared library
│   ├── config.py            # Paths, class defs, defaults
│   ├── dataset.py           # SampleRegistry
│   ├── preprocessing.py     # Image loading, normalization
│   ├── annotation_utils.py  # YOLO annotation parsing
│   ├── splits.py            # Train/val/test splitting
│   ├── augmentation.py      # Albumentations transforms
│   ├── evaluation.py        # PredictionResult, converters
│   ├── metrics.py           # SegmentationMetrics
│   ├── postprocessing.py    # Post-processing pipeline
│   ├── downstream.py        # Downstream metric computation
│   ├── visualization.py     # Shared visualization utilities
│   ├── formats/             # YOLO, COCO, Mask NPZ exporters
│   └── models/              # Model-specific datasets/utils
└── data/
    ├── image/               # {Species}/{Microscope}/{Exp}/{Sample}/
    ├── annotation/          # YOLO polygon .txt files
    ├── prediction/          # Generated prediction .txt files
    └── downstream/          # Downstream analysis results

Data Directory Layout

Scripts that take a --data-dir argument expect it to point to a folder containing an image/ subfolder with the TIF images. Two layouts are supported:

Structured layout (used for training data with annotations):

my_data/
├── image/
│   └── {Species}/{Microscope}/{Exp}/{Sample}/
│       ├── {Sample}_TRITC.tif
│       ├── {Sample}_FITC.tif
│       └── {Sample}_DAPI.tif
├── annotation/          # YOLO polygon .txt files
├── prediction/          # Generated by predict.py
└── downstream/          # Generated by analyze_downstream.py

Generic layout (for new/unlabeled data):

my_data/
├── image/
│   ├── sample_001/
│   │   ├── sample_001_TRITC.tif
│   │   ├── sample_001_FITC.tif
│   │   └── sample_001_DAPI.tif
│   ├── sample_002/
│   │   └── ...
│   └── ...
├── prediction/          # Generated by predict.py
└── downstream/          # Generated by analyze_downstream.py

Each sample must be in its own subfolder under image/ containing three TIF files with _TRITC, _FITC, and _DAPI suffixes.
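
The rule above can be sketched as a small discovery helper. This is a minimal illustration, assuming a sample is any folder below image/ that holds all three channel files; the name `find_samples` is hypothetical and is not taken from the repository.

```python
from pathlib import Path

CHANNELS = ("TRITC", "FITC", "DAPI")

def find_samples(data_dir):
    """Return {relative_sample_path: {channel: tif_path}} for complete samples.

    Hypothetical helper: it walks image/ recursively, so it covers both the
    structured and the generic layout. Folders missing a channel are skipped.
    """
    image_root = Path(data_dir) / "image"
    samples = {}
    for folder in sorted(p for p in image_root.rglob("*") if p.is_dir()):
        files = {ch: folder / f"{folder.name}_{ch}.tif" for ch in CHANNELS}
        if all(path.is_file() for path in files.values()):
            # Key by path relative to image/ so equal sample names in
            # different experiments do not collide.
            samples[folder.relative_to(image_root).as_posix()] = files
    return samples
```

For the generic layout the keys are plain folder names such as `sample_001`; for the structured layout they carry the full Species/Microscope/Exp/Sample path.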

Training

Training scripts are in the train/ folder. Each can be run directly:

# YOLO (4-class only)
python train/train_yolo.py --strategy A --num-classes 4

# U-Net++ multilabel (4-class or 5-class with masked loss)
python train/train_unet.py --mode multilabel --strategy A --num-classes 4
python train/train_unet.py --mode multilabel --strategy A --num-classes 5 --mask-missing

# SAM (4 or 5 classes)
python train/train_sam.py --strategy A --num-classes 5

# Cellpose (4 or 5 classes, per-class models)
python train/train_cellpose.py --version 3 --all-classes --strategy A --num-classes 5

Strategy A Benchmark Grid (Runs 1-5)

Run all 5 benchmark models sequentially with run_grid_training.py:

python train/run_grid_training.py                  # Run all 5 benchmark runs
python train/run_grid_training.py --only 1 3       # Run only runs #1 and #3
python train/run_grid_training.py --skip 4 5       # Skip SAM and Cellpose
python train/run_grid_training.py --epochs 5       # Quick test with 5 epochs
python train/run_grid_training.py --eval-only      # Only evaluate existing checkpoints
python train/run_grid_training.py --train-only     # Only train, skip evaluation

Run	Model	Classes	Epochs	Key Config
1	YOLO11m-seg	4	300	SGD, patience 15
2	U-Net++ multilabel	4	300	AdamW, cosine LR, patience 15
3	U-Net++ multilabel	5	300	AdamW, cosine LR, masked loss, patience 15
4	SAM vit_b	5	300	AdamW, cosine LR, frozen encoder, patience 15
5	Cellpose v3 per-class	5	150	AdamW, no early stopping

Each run trains the model and then evaluates on the test set via evaluate.py.

Annotation Class Index

Annotations are stored in YOLO polygon format (.txt files) using 6 raw classes. During training, these are converted into 4 or 5 semantic target classes by subtracting inner boundaries from outer boundaries to create ring masks.

Annotation classes (raw, in `.txt` files)

Class ID	Name	Color	Description
0	Whole Root	Blue	Outer boundary of the entire root cross-section (1 per sample)
1	Aerenchyma	Yellow	Individual air spaces in the cortex (many per sample, cereals only)
2	Outer Endodermis	Green	Outer boundary of the endodermis ring (1 per sample)
3	Inner Endodermis	Red	Inner boundary of the endodermis ring (1 per sample)
4	Outer Exodermis	Orange	Outer boundary of the exodermis ring (1 per sample, tomato only)
5	Inner Exodermis	Purple	Inner boundary of the exodermis ring (1 per sample, tomato only)
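
As an illustration of the raw format, the sketch below parses annotation text into per-class polygons. It assumes the usual YOLO segmentation line layout (`class_id x1 y1 x2 y2 ...` with coordinates normalized to [0, 1]); the project's own parser lives in src/annotation_utils.py and may differ, and `parse_yolo_polygons` is a hypothetical name.

```python
import numpy as np

# Raw annotation class IDs, as listed in the table above.
CLASS_NAMES = {
    0: "Whole Root", 1: "Aerenchyma", 2: "Outer Endodermis",
    3: "Inner Endodermis", 4: "Outer Exodermis", 5: "Inner Exodermis",
}

def parse_yolo_polygons(text, width, height):
    """Parse YOLO polygon lines into (class_id, N x 2 pixel-coordinate array) pairs."""
    polygons = []
    for line in text.splitlines():
        parts = line.split()
        # Need a class id plus at least three (x, y) pairs; skip anything else.
        if len(parts) < 7 or len(parts) % 2 == 0:
            continue
        class_id = int(parts[0])
        coords = np.array(parts[1:], dtype=float).reshape(-1, 2)
        coords *= (width, height)  # normalized -> pixel coordinates
        polygons.append((class_id, coords))
    return polygons
```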

Target classes (derived, used for model training)

Target ID	Name	Derivation	Species
0	Whole Root	Direct from annotation class 0	All
1	Aerenchyma	Direct from annotation class 1	Cereals only
2	Endodermis	Ring mask: annotation cls 2 minus cls 3	All
3	Vascular	Area inside annotation class 3	All
4	Exodermis	Ring mask: annotation cls 4 minus cls 5	Tomato only

4-class mode (--num-classes 4): targets 0-3 only, exodermis ignored
5-class mode (--num-classes 5): targets 0-4; with --mask-missing, the loss ignores classes that are not annotated for a sample's species
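
The idea behind a masked loss can be sketched in NumPy as follows. This is not the training code: the function name, the use of binary cross-entropy, and the choice of which channels count as "missing" are assumptions made for illustration.

```python
import numpy as np

def masked_bce(probs, targets, class_valid, eps=1e-7):
    """Mean binary cross-entropy over annotated (sample, class) channels only.

    probs, targets: (B, C, H, W) arrays with values in [0, 1].
    class_valid:    (B, C) booleans, False where a class has no annotation
                    for that sample's species (assumed convention).
    """
    probs = np.clip(probs, eps, 1.0 - eps)
    bce = -(targets * np.log(probs) + (1.0 - targets) * np.log(1.0 - probs))
    weights = np.broadcast_to(class_valid[:, :, None, None], bce.shape).astype(float)
    # Masked channels contribute neither to the numerator nor the denominator.
    return float((bce * weights).sum() / max(weights.sum(), 1.0))
```

With this formulation, whatever the model predicts in a masked channel (for example, the exodermis channel of a cereal sample) has no effect on the loss value.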

Important: Ring classes (endodermis, exodermis) are derived by subtracting the inner boundary polygon from the outer boundary polygon. This subtraction happens during data loading (in annotation_utils.py), not in post-processing. The post-processing fill_holes step is ring-aware — it preserves the structural central hole of ring masks while filling only small artifact holes in the ring band.
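
The derivation in the table above can be sketched with plain NumPy. This is a simplified stand-in for what annotation_utils.py does, not a copy of it: `polygon_to_mask` and `derive_targets` are hypothetical names, and the rasterizer uses a basic even-odd test at pixel centers.

```python
import numpy as np

def polygon_to_mask(points, height, width):
    """Rasterize a polygon (N x 2 pixel coords, x then y) with the even-odd rule."""
    pts = np.asarray(points, dtype=float)
    ys, xs = np.mgrid[0:height, 0:width]
    px, py = xs + 0.5, ys + 0.5  # test pixel centers
    inside = np.zeros((height, width), dtype=bool)
    nxt = np.roll(pts, -1, axis=0)
    with np.errstate(divide="ignore", invalid="ignore"):
        for (ax, ay), (bx, by) in zip(pts, nxt):
            crosses = (ay > py) != (by > py)  # edge spans this pixel row
            x_at = ax + (py - ay) * (bx - ax) / (by - ay)
            inside ^= crosses & (px < x_at)
    return inside

def derive_targets(polys, height, width):
    """polys: {raw_class_id: [polygon, ...]} -> {target_name: bool mask}."""
    def union(class_id):
        mask = np.zeros((height, width), dtype=bool)
        for pts in polys.get(class_id, []):
            mask |= polygon_to_mask(pts, height, width)
        return mask

    inner_endo, inner_exo = union(3), union(5)
    return {
        "Whole Root": union(0),
        "Aerenchyma": union(1),
        "Endodermis": union(2) & ~inner_endo,  # ring: outer minus inner
        "Vascular": inner_endo,                # area inside inner endodermis
        "Exodermis": union(4) & ~inner_exo,    # ring: outer minus inner
    }
```

For a cereal sample the Exodermis mask comes out empty, and for a tomato sample the Aerenchyma mask does, which matches the per-species columns in the tables above.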

Annotation counts per species

Species	Samples	cls 0: Whole Root	cls 1: Aerenchyma	cls 2: Outer Endo	cls 3: Inner Endo	cls 4: Outer Exo	cls 5: Inner Exo	Total Polygons
Millet	110	110	418	110	110	0	0	748
Rice	588	588	13,706	588	588	0	0	15,470
Sorghum	474	474	9,218	474	474	0	0	10,640
Tomato	545	545	0	545	545	545	545	2,725
Total	1,717	1,717	23,342	1,717	1,717	545	545	29,583

Classes 0, 2, 3 have exactly 1 polygon per sample across all species
Aerenchyma (cls 1): cereals only — avg ~3.8 (Millet), ~23.3 (Rice), ~19.4 (Sorghum) per sample
Exodermis (cls 4-5): tomato only — exactly 1 polygon each per sample
Tomato has no aerenchyma; cereals have no exodermis

Strategy A Data Split

All models share the same experiment-level split (seed=42). Samples from the same experiment always stay together. Rice/Zeiss (35 samples) is excluded because it is reserved for deployment evaluation, so the table covers 1,682 of the 1,717 annotated samples.

Species	Microscope	Train	Val	Test	Total
Millet	Olympus	67 (1 exp)	29 (1 exp)	14 (1 exp)	110
Rice	C10	38 (6 exps)	—	12 (4 exps)	50
Rice	Olympus	383 (12 exps)	91 (2 exps)	29 (3 exps)	503
Sorghum	C10	—	25 (1 exp)	19 (4 exps)	44
Sorghum	Olympus	366 (77 exps)	43 (9 exps)	21 (13 exps)	430
Tomato	C10	54 (1 exp)	—	11 (2 exps)	65
Tomato	Olympus	432 (15 exps)	23 (1 exp)	25 (4 exps)	480
Total		1,340	211	131	1,682
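
An experiment-level split can be sketched as below. This only illustrates the "experiments stay together" constraint; the real logic is in src/splits.py, and the function name and the fixed fractions here are hypothetical (the table above shows the actual split is not a simple fixed ratio).

```python
import random
from collections import defaultdict

def split_by_experiment(samples, val_frac=0.15, test_frac=0.15, seed=42):
    """Assign whole experiments to train/val/test.

    samples: iterable of (sample_id, experiment_id) pairs.
    Returns {"train": [...], "val": [...], "test": [...]} of sample ids.
    """
    by_exp = defaultdict(list)
    for sample_id, exp_id in samples:
        by_exp[exp_id].append(sample_id)

    exps = sorted(by_exp)               # sort first so the shuffle is reproducible
    random.Random(seed).shuffle(exps)

    n_val = round(len(exps) * val_frac)
    n_test = round(len(exps) * test_frac)
    groups = {
        "val": exps[:n_val],
        "test": exps[n_val:n_val + n_test],
        "train": exps[n_val + n_test:],
    }
    return {name: [s for e in members for s in by_exp[e]]
            for name, members in groups.items()}
```

Because the unit of assignment is the experiment, no experiment can contribute samples to more than one split, and the same seed always yields the same assignment.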

Configurable Training Parameters

The training scripts share the following flags (exceptions are noted in the description):

Flag	Description
`--epochs`	Max training epochs
`--batch-size`	Batch size
`--lr`	Learning rate
`--weight-decay`	Weight decay
`--patience`	Early stopping patience
`--optimizer`	Optimizer: adamw, adam, sgd
`--scheduler`	LR scheduler: cosine, step, plateau
`--num-classes`	Target classes: 4 (standard) or 5 (with exodermis)
`--mask-missing`	Enable masked loss for missing annotations (U-Net++ only)
`--save-every`	Save periodic checkpoint every N epochs
`--img-size`	Input image size (default 1024)

Scripts

1. `predict.py` — Inference + Save Predictions

Run model inference on images, save predictions as YOLO .txt files, and optionally generate visualizations.

# Run on an arbitrary folder of TIF images
python predict.py --data-dir path/to/new_images/ --checkpoint path/to/best.pt

# Skip visualization generation
python predict.py --data-dir data/ --checkpoint path/to/best.pt --no-vis

# Custom batch size and confidence threshold
python predict.py --data-dir data/ --checkpoint path/to/best.pt \
    --batch-size 32 --conf-thresh 0.3

# Skip post-processing (raw YOLO output)
python predict.py --data-dir data/ --checkpoint path/to/best.pt --no-postprocess

Arguments:

Argument	Default	Description
`--data-dir`	(required)	Directory containing an `image/` subfolder with TIF images
`--checkpoint`	(required)	YOLO model checkpoint (`.pt`)
`--img-size`	1024	Inference image size
`--conf-thresh`	0.25	Confidence threshold
`--batch-size`	16	GPU batch size
`--no-vis`	false	Skip visualization output
`--no-postprocess`	false	Disable post-processing (ring-aware hole filling, aerenchyma clipping, etc.)
`--max-dim`	800	Max dimension for visualization images
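
To make "ring-aware hole filling" concrete, here is one way such a step could work. It is an assumption-laden sketch, not the code in src/postprocessing.py: the function names, the 4-connectivity, the "largest enclosed hole is the structural one" rule, and the `max_hole_area` default are all hypothetical.

```python
from collections import deque
import numpy as np

def background_components(mask):
    """Yield (pixels, touches_border) for each 4-connected background region."""
    h, w = mask.shape
    seen = mask.copy()  # foreground counts as already visited
    for sy, sx in zip(*np.nonzero(~mask)):
        if seen[sy, sx]:
            continue
        queue, pixels, border = deque([(int(sy), int(sx))]), [], False
        seen[sy, sx] = True
        while queue:
            y, x = queue.popleft()
            pixels.append((y, x))
            border = border or y in (0, h - 1) or x in (0, w - 1)
            for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                if 0 <= ny < h and 0 <= nx < w and not seen[ny, nx]:
                    seen[ny, nx] = True
                    queue.append((ny, nx))
        yield pixels, border

def fill_holes_ring_aware(mask, is_ring, max_hole_area=50):
    """Fill enclosed holes; for ring classes keep the largest (central) hole."""
    mask = np.asarray(mask, dtype=bool)
    holes = [p for p, border in background_components(mask) if not border]
    if is_ring and holes:
        holes.sort(key=len)
        # Drop the largest hole (assumed structural), fill only small artifacts.
        holes = [p for p in holes[:-1] if len(p) <= max_hole_area]
    filled = mask.copy()
    for pixels in holes:
        ys, xs = np.array(pixels).T
        filled[ys, xs] = True
    return filled
```

In this sketch a non-ring class has every enclosed hole filled, while a ring class keeps its central opening and loses only small gaps inside the band.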

Output:

{data-dir}/prediction/*.txt — YOLO-format polygon predictions (one per sample)
{data-dir}/prediction/vis/*.png — 2-panel (Original | Prediction) overlay images (unless --no-vis)

2. `evaluate.py` — Model Evaluation

Evaluate any trained model against ground truth annotations. Supports all 4 model types: YOLO, U-Net++, SAM, and Cellpose. Computes IoU/Dice metrics, generates comparison plots, and saves visualizations.

# Evaluate YOLO model
python evaluate.py --model yolo --checkpoint path/to/best.pt \
    --strategy A --num-classes 4

# Evaluate U-Net++ (multilabel mode, 5-class)
python evaluate.py --model unet --unet-mode multilabel \
    --checkpoint path/to/best.ckpt --strategy A --num-classes 5

# Evaluate SAM
python evaluate.py --model sam --sam-type vit_b \
    --checkpoint path/to/best.pth --strategy A --num-classes 5

# Evaluate Cellpose (loads per-class models from directory)
python evaluate.py --model cellpose \
    --checkpoint path/to/cellpose_run_dir/ --strategy A --num-classes 5

# Use saved predictions instead of running inference
python evaluate.py --from-predictions data/prediction/ --strategy A

# Skip visualizations, only compute metrics
python evaluate.py --model yolo --checkpoint best.pt --no-vis

# Regenerate plots from existing metrics JSON
python evaluate.py --plot-only output/evaluation/yolo_metrics.json

Arguments:

Argument	Default	Description
`--data-dir`	`data/`	Data directory with `image/` and `annotation/` subfolders
`--model`	(required*)	Model type: `yolo`, `unet`, `sam`, `cellpose`
`--checkpoint`	(required*)	Path to model checkpoint (file or dir for Cellpose)
`--num-classes`	4	Number of target classes (4 or 5)
`--from-predictions`	—	Load saved YOLO `.txt` files (skip inference)
`--img-size`	1024	Inference image size
`--unet-mode`	`semantic`	U-Net mode: `semantic` or `multilabel`
`--sam-type`	`vit_b`	SAM model type: `vit_b`, `vit_l`, `vit_h`
`--strategy`	—	Split strategy: `A`, `B`, `C`
`--split`	`test`	Which split to evaluate: `train`, `val`, `test`
`--seed`	42	Random seed for split generation
`--no-vis`	false	Skip visualization overlay images
`--vis-dir`	auto	Custom visualization output directory
`--no-metrics`	false	Skip metric computation
`--no-plots`	false	Skip plot generation
`--plot-only`	—	Regenerate plots from existing JSON
`--enable-pp`	—	Force-enable post-processing steps
`--disable-pp`	—	Force-disable post-processing steps
`--no-postprocess`	false	Disable all post-processing

*Not required when using --plot-only or --from-predictions.

Output:

output/evaluation/{model}_metrics.json — Aggregated metrics
output/evaluation/{model}_per_sample.csv — Per-sample IoU/Dice
output/evaluation/{model}_*_comparison.{png,pdf} — Box plots by species/microscope
output/evaluation/vis_{model}/*.png — 3-panel (Original | GT | Prediction) overlays
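
For reference, the two reported metrics follow the standard definitions IoU = |P ∩ G| / |P ∪ G| and Dice = 2|P ∩ G| / (|P| + |G|). The sketch below computes both for one pair of binary masks; the function name and the choice to return NaN when a class is absent from both masks are assumptions, and src/metrics.py may handle that case differently.

```python
import numpy as np

def iou_and_dice(pred, gt):
    """Return (IoU, Dice) for two binary masks of the same shape."""
    pred, gt = np.asarray(pred, dtype=bool), np.asarray(gt, dtype=bool)
    inter = int(np.logical_and(pred, gt).sum())
    total = int(pred.sum()) + int(gt.sum())
    if total == 0:
        # Class absent from both masks: undefined, left to the caller.
        return float("nan"), float("nan")
    union = total - inter
    return inter / union, 2 * inter / total
```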

3. Downstream Biological Analysis

Three entry points compute the same per-sample biological measurements (aerenchyma ratio + Exodermis/Endodermis/Vascular mean intensity on TRITC and FITC) from bio-7 masks. They differ only in where the masks come from:

Script	Source	Compares GT vs pred?	GPU?
`run_eval_pipeline.py`	runs evaluation, then measures both GT and predictions	yes	yes
`downstream_measure_from_predictions.py`	saved YOLO `.txt` predictions + GT	yes	no
`downstream_measure_from_masks.py`	any mask directory (GT or preds)	no (single CSV)	no

All three share the same intensity-thresholding feature (described below).

Computed columns (per sample):

aerenchyma_ratio — aerenchyma area / whole root area
exodermis_TRITC, exodermis_FITC
endodermis_TRITC, endodermis_FITC
vascular_TRITC, vascular_FITC

Mean intensities are computed on the raw, unnormalized image (load_sample_raw).
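
The computed columns reduce to two small operations on masks, sketched below. The function names are hypothetical and the NaN fallback for empty regions is an assumption; src/downstream.py is the authoritative implementation.

```python
import numpy as np

def aerenchyma_ratio(aerenchyma_mask, whole_root_mask):
    """Aerenchyma area divided by whole-root area (both counted in pixels)."""
    root_area = np.count_nonzero(whole_root_mask)
    if root_area == 0:
        return float("nan")
    return np.count_nonzero(aerenchyma_mask) / root_area

def mean_intensity(raw_channel, region_mask):
    """Mean of the raw (unnormalized) channel values inside a region mask."""
    values = raw_channel[np.asarray(region_mask, dtype=bool)]
    return float(values.mean()) if values.size else float("nan")
```

For example, `mean_intensity(tritc_image, masks["Endodermis"])` would correspond to the `endodermis_TRITC` column, where `tritc_image` and `masks` stand in for the loaded raw channel and the derived target masks.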

`run_eval_pipeline.py` — Eval + downstream + plots

Full pipeline: runs IoU/Dice eval, saves predictions, then computes paired GT-vs-pred downstream measurements and correlation plots.

python run_eval_pipeline.py --model-key timm_semantic --checkpoint path/to/best.ckpt --run-dir path/to/run/
python run_eval_pipeline.py ... --split test            # only test split (skip Zeiss oneshot)
python run_eval_pipeline.py ... --downstream-source model   # re-run inference (needs GPU)
python run_eval_pipeline.py ... --no-downstream         # eval only

Outputs {run_dir}/eval/{test,oneshot}/... and {run_dir}/downstream/{split}_from_{predictions,model}/....

`downstream_measure_from_predictions.py` — From saved predictions

Loads YOLO polygon .txt files from a previous eval run; recomputes GT measurements from the annotation polygons every time so a different threshold always produces a fresh paired GT.

# Single run
python downstream_measure_from_predictions.py --predictions-dir output/runs/.../eval_test/predictions --strategy A --out-dir output/runs/.../downstream

# Batch over all runs missing test-set downstream
python downstream_measure_from_predictions.py --batch --strategy A --plot

Writes pred_measurements.csv and gt_measurements.csv (always recomputed) into the run's downstream dir.

`downstream_measure_from_masks.py` — Source-agnostic

Walks a generic image directory and pairs each sample with a mask file from any directory. No GT-vs-pred logic; outputs a single measurements CSV. Use this when you want to measure any mask folder (GT, predictions, hand-corrected, etc.) without the comparison machinery.

# Measure GT
python downstream_measure_from_masks.py --image-dir data/image --mask-dir data/annotation --out-csv output/downstream/gt_all.csv

# Measure a model's predictions
python downstream_measure_from_masks.py --image-dir data/image --mask-dir output/runs/.../eval/test/predictions --out-csv output/downstream/pred_all.csv

--image-dir must contain the standard {Species}/{Microscope}/{Exp}/{Sample}/{Sample}_{TRITC,FITC,DAPI}.tif layout. --mask-dir is any flat directory containing {Species}_{Microscope}_{Exp}_{Sample}.txt files in the project's 6-class YOLO polygon format.
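
The naming convention can be expressed as a short path mapping. This is a sketch of the convention as described above, with a hypothetical function name; the script's actual pairing code may differ.

```python
from pathlib import Path

def mask_path_for(sample_dir, image_dir, mask_dir):
    """Map {image_dir}/{Species}/{Microscope}/{Exp}/{Sample}/ to its flat mask file."""
    parts = Path(sample_dir).relative_to(image_dir).parts
    if len(parts) != 4:
        raise ValueError(f"expected Species/Microscope/Exp/Sample, got {parts}")
    # Join the four path components with underscores to form the mask file name.
    return Path(mask_dir) / ("_".join(parts) + ".txt")
```

Building the name from the path components avoids having to split a file name on underscores, which would be ambiguous if an experiment or sample name contains one.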

Arguments:

Argument	Default	Description
`--image-dir`	(required)	Root image directory.
`--mask-dir`	(required)	Directory of YOLO polygon `.txt` files.
`--out-csv`	(required)	Output measurements CSV path.
`--save-vis-dir`	—	Save per-sample threshold diagnostic PNGs into this dir.
`--tritc-threshold`	—	Global TRITC keep-range (see below).
`--fitc-threshold`	—	Global FITC keep-range.
`--threshold`	— (repeat)	Per-structure keep-range, e.g. `Exodermis:TRITC=4000-5000`.

Intensity thresholding (all three scripts)

Each (region, channel) intensity measurement can be optionally restricted to pixels whose raw value falls in a keep-range [low, high] (inclusive). Pixels outside the range are excluded from the average. Default is no thresholding.

Ranges must be written as LOW-HIGH, where LOW may be `min` and HIGH may be `max`; bare numbers are rejected to avoid ambiguity:

Spec	Pixels kept
`4000-5000`	`4000 ≤ value ≤ 5000`
`min-5000`	`value ≤ 5000`
`5000-max`	`value ≥ 5000`

Three flags, all optional:

--tritc-threshold RANGE — applies the same TRITC range to Exodermis, Endodermis, Vascular.
--fitc-threshold RANGE — same for FITC.
--threshold REGION:CHANNEL=RANGE — repeatable; overrides the global flag for that specific (region, channel). REGION ∈ {Exodermis, Endodermis, Vascular}, CHANNEL ∈ {TRITC, FITC}.

# Same threshold across all three structures
... --tritc-threshold 4000-5000 --fitc-threshold min-800

# Per-structure
... --threshold Exodermis:TRITC=4000-5000 --threshold Endodermis:TRITC=3000-max --threshold Vascular:FITC=min-800

# Global default + per-structure override
... --tritc-threshold 5000-max --threshold Exodermis:TRITC=8000-max
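
The range syntax and the override rule can be sketched as follows. The function names are hypothetical and the code is an illustration of the documented behavior (inclusive bounds, `min`/`max` keywords, per-structure overrides beating the global flag), not the scripts' actual parser.

```python
import math
import numpy as np

def parse_range(spec):
    """Parse 'LOW-HIGH' into (low, high); LOW may be 'min', HIGH may be 'max'."""
    low, sep, high = spec.partition("-")
    if not sep or not low or not high:
        raise ValueError(f"expected LOW-HIGH, got {spec!r}")
    lo = -math.inf if low == "min" else float(low)
    hi = math.inf if high == "max" else float(high)
    return lo, hi

def resolve_range(region, channel, global_ranges, overrides):
    """Per-(region, channel) override wins; otherwise fall back to the global range."""
    return overrides.get((region, channel), global_ranges.get(channel))

def thresholded_mean(raw_channel, region_mask, keep_range=None):
    """Mean raw intensity in the region, restricted to the inclusive keep-range."""
    values = raw_channel[np.asarray(region_mask, dtype=bool)].astype(float)
    if keep_range is not None:
        lo, hi = keep_range
        values = values[(values >= lo) & (values <= hi)]
    return float(values.mean()) if values.size else float("nan")
```

With `--tritc-threshold 5000-max --threshold Exodermis:TRITC=8000-max`, `resolve_range` would return the 8000-max range for Exodermis on TRITC, the 5000-max range for Endodermis and Vascular on TRITC, and `None` (no thresholding) for every FITC measurement.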

Threshold visualization (`downstream_measure_from_masks.py` only)

When --save-vis-dir DIR is set together with one or more thresholds, the script saves a diagnostic PNG per (sample, channel-with-threshold) at DIR/{uid}_{CHANNEL}.png. Each PNG has a title bar listing the per-region ranges, plus four panels side by side:

Mask before threshold overlay — original Exodermis/Endodermis/Vascular masks blended over the raw channel image.
Mask before threshold — the same masks alone, on a black background.
Mask after threshold — only the pixels that fall in the keep-range, alone on black.
Mask after threshold overlay — kept pixels blended over the raw channel image.

The CSV's reported intensity is the mean over the kept pixels (panel 3 / panel 4).

If --save-vis-dir is set but no threshold is given, the script warns and skips visualization. Channels with no threshold are skipped; e.g. setting only TRITC ranges produces no FITC PNGs.

4. `polygon_editor.py` — Interactive Annotation Editor

GUI tool for visualizing, correcting, and creating YOLO polygon annotations.

# Launch with a data directory
python polygon_editor.py --data-dir data/

# Launch with generic (non-structured) data
python polygon_editor.py --data-dir path/to/new_data/

# Launch without arguments (use Browse button to select folder)
python polygon_editor.py

Modes (select from the Mode dropdown):

Mode	Panels	Required folders	Description
Correct GT	3 (Original, GT, Prediction)	`image/`, `annotation/`, `prediction/`	Edit ground truth with predictions as reference
Correct Predictions	2 (Original, Editable)	`image/`, `prediction/`	Edit predictions, save to `annotation/`
Create GT	2 (Original, Editable)	`image/`	Draw annotations from scratch, save to `annotation/`

Controls:

Key	Action
`A` / `Left`	Previous sample
`D` / `Right`	Next sample
`N`	Start drawing new polygon with nodes (click to add points)
`B`	Enter brush mode (erase default, Shift=add, Ctrl+scroll=size)
`E`	Edit selected polygon with brush (same as B on selection)
`Enter`	Confirm drawing or edits
`Escape`	Cancel drawing or edits (reverts all changes)
`Delete` / `Backspace`	Delete selected vertex (edit mode) or polygon
`S`	Save annotations to file
`Ctrl+C`	Copy selected reference polygon to editable panel
`C`	Copy ALL reference polygons to editable panel
`1`-`4`	Set class for new polygon
`Ctrl+Z` / `Ctrl+Shift+Z`	Undo / Redo
Mouse wheel	Zoom in/out
Middle/Right drag	Pan the image
`H`	Reset zoom and center all panels

Vertex editing: Drag vertices to move them. Hover over an edge to see a green "+" marker; click to add a vertex. Select a vertex and press Delete to remove it.

Saving: Press S to save. Annotations are saved in YOLO polygon format to {data-dir}/annotation/ (created automatically if it does not exist).

Typical Workflow

# 1. Run all Strategy A benchmark models (trains + evaluates all 5 runs)
python train/run_grid_training.py

# 2. Or train a single model
python train/train_unet.py --mode multilabel --strategy A --num-classes 5 --mask-missing

# 3. Evaluate on test split
python evaluate.py --model unet --unet-mode multilabel --checkpoint path/to/best.ckpt \
    --strategy A --num-classes 5

# 4. Generate predictions on new data (predict.py expects a YOLO .pt checkpoint)
python predict.py --data-dir path/to/new_data/ --checkpoint path/to/best.pt

# 5. Run downstream analysis
python analyze_downstream.py --data-dir data/ --source both

# 6. Review and correct predictions
python polygon_editor.py --data-dir path/to/new_data/
