Multimodal deep learning framework, datasets, and models for plankton identification.
Part of the Inria Challenge OcéanIA.
Planktonzilla is a framework for managing datasets, training computer vision models, and evaluating their performance on plankton image identification tasks. Built on top of Hugging Face Transformers, with Hydra for configuration management, it offers specialized tools for handling imbalanced plankton datasets, including state-of-the-art loss functions for imbalanced learning.
Highlights:
- planktonzilla-17M dataset: 17 million plankton images from 9 different datasets, all standardized and preprocessed for deep learning applications. Available: https://huggingface.co/datasets/project-oceania/planktonzilla-17m.
- OcéanIA project website: https://oceania.inria.cl.
- OcéanIA on Hugging Face Hub (datasets, trained models, and demos): https://huggingface.co/project-oceania.
The published planktonzilla models are landing in the v1 release. The snippet below is the target API every v1 model will conform to: a single universal from_pretrained call that works for the entire model collection, with no clone of this repository required.
from transformers import AutoModelForImageClassification, AutoImageProcessor
from PIL import Image
model_id = "project-oceania/<model-name>" # see https://huggingface.co/project-oceania
processor = AutoImageProcessor.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForImageClassification.from_pretrained(model_id, trust_remote_code=True)
image = Image.open("plankton.jpg").convert("RGB")
inputs = processor(images=image, return_tensors="pt")
outputs = model(**inputs)
predicted_idx = outputs.logits.argmax(-1).item()
print(model.config.id2label[predicted_idx])
- Browse published models: https://huggingface.co/project-oceania
- No clone of this repo is required: pip install transformers pillow is the consumer dependency surface.
- v1 status: pre-trained models are being prepared for release; see project status and the HF org page above for current availability.
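The snippet above prints only the single most likely label. For plankton classes that are easy to confuse, it can help to look at the top few candidates with their probabilities. The following is a minimal sketch in pure Python, assuming the logits have already been extracted as a list of floats (for example via outputs.logits[0].tolist()); the label names and logit values are hypothetical placeholders, not output from any published model.

```python
import math


def softmax(logits):
    """Convert raw logits to probabilities (numerically stable form)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(logits, id2label, k=3):
    """Return the k most probable (label, probability) pairs, best first."""
    probs = softmax(logits)
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(id2label[i], probs[i]) for i in ranked[:k]]


# Hypothetical stand-ins for outputs.logits[0].tolist() and model.config.id2label.
example_logits = [2.0, 0.5, -1.0, 1.0]
example_id2label = {0: "copepod", 1: "diatom", 2: "detritus", 3: "radiolarian"}

for label, prob in top_k(example_logits, example_id2label, k=2):
    print(f"{label}: {prob:.3f}")
```

With a real model, the same helper can be applied to each row of the logits tensor after converting it to a list.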
- Modular Configuration: Hydra-based hierarchical configuration.
- Multiple Plankton Dataset Support: Built-in support for all plankton image datasets we know of.
- Specialized Loss Functions for Class Imbalance: Advanced loss functions for imbalanced classification (Focal, LDAM, Asymmetric, etc.).
- Model Hub Integration: Seamless integration with Hugging Face Hub for model sharing.
- Experiment Tracking: Built-in support for Weights & Biases, MLFlow, and Trackio.
- Flexible Training Pipeline: Based on Hugging Face Transformers Trainer with custom enhancements.
- Easy CLI Interface: Simple command-line tools for all operations.
flowchart LR
%% ── Configuration (shared across both pipelines) ──
subgraph CFG_GRP["⚙️ Hydra Config Tree (configs/)"]
direction TB
CFG_TRAIN[train.yaml]
CFG_IMPORT[import_dataset.yaml]
CFG_MODEL[model/]
CFG_DATA[dataset/]
CFG_AUG[augmentation/]
CFG_LOSS[custom_loss/]
CFG_PEFT[peft/ - LoRA]
CFG_HPARAM[hparams_search/ - Optuna]
CFG_TRACK[tracking/]
CFG_TRAIN_ARGS[training_arguments/]
end
%% ── Stage 1: Data Ingestion ──
subgraph INGEST["📥 Data Ingestion · pz_import_dataset"]
direction LR
RAW[Raw plankton sources<br/>WHOI / EcoTaxa / public]
IMPORTER[dataset_import/<br/>dataset_importer.py]
PUSH_DS[push to HF Hub]
RAW --> IMPORTER --> PUSH_DS
end
%% ── Stage 2: Training & Evaluation ──
subgraph TRAIN["🏋️ Training · pz_train"]
direction TB
DATA[dataset.py<br/>+ augmentation pipeline]
MODEL_HF[HF / timm classifier]
MODEL_CLIP[clip_model.py<br/>OpenCLIP backbone]
LOSS[loss.py · custom losses]
PEFT_ADAPT[PEFT / LoRA adapters]
TRAINER[HF Trainer loop]
HPARAM[Optuna sweep<br/>optional]
DATA --> TRAINER
MODEL_HF --> TRAINER
MODEL_CLIP --> TRAINER
LOSS --> TRAINER
PEFT_ADAPT --> TRAINER
HPARAM -.->|trials| TRAINER
end
%% ── Outputs ──
HF_HUB[(🤗 HF Hub<br/>datasets · checkpoints)]
TRACK_BE[W&B · MLflow · Trackio]
LOGS[logs/ · checkpoints/<br/>wandb/]
%% ── Orchestration ──
subgraph ORCH["🛰️ Orchestration"]
direction TB
SLURM[scripts/*.sh<br/>SLURM · torchrun multi-node]
DEVCT[.devcontainer/<br/>CUDA 12.5 · Python 3.12]
TESTS[tests/ · pytest]
end
%% ── Wiring ──
CFG_GRP -. merged config .-> INGEST
CFG_GRP -. merged config .-> TRAIN
PUSH_DS --> HF_HUB
HF_HUB -->|load_dataset| DATA
TRAINER -->|metrics| TRACK_BE
TRAINER -->|checkpoints + logs| LOGS
TRAINER -.->|push_to_hub| HF_HUB
SLURM -->|launch| INGEST
SLURM -->|launch| TRAIN
DEVCT -.->|env for| TRAIN
TESTS -.->|validate| TRAIN
classDef hub fill:#fef3c7,stroke:#d97706,stroke-width:2px
classDef cfg fill:#e0e7ff,stroke:#4338ca
class HF_HUB hub
class CFG_GRP cfg
planktonzilla/
βββ configs/ # Hydra configuration files
β βββ dataset/ # Dataset-specific configs
β βββ model/ # Model architecture configs
β βββ training_arguments/ # Training hyperparameters
β βββ augmentation/ # Data augmentation strategies
β βββ custom_loss/ # Loss function configurations
β βββ tracking/ # Experiment tracking setup
βββ planktonzilla/ # Main package
β βββ dataset.py # Dataset loading and preprocessing
β βββ train.py # Training pipeline
β βββ loss.py # Custom loss functions
β βββ clip_model.py # CLIP-based model wrapper
β βββ dataset_import/ # Dataset import utilities
β βββ utils/ # Logging, Hydra helpers
βββ tests/ # Test suite
- Python 3.11-3.14
- uv for dependency management
- CUDA-compatible GPU (recommended for training)
# Clone the repository
git clone https://github.com/Inria-Chile/planktonzilla.git
cd planktonzilla
# Install dependencies (creates .venv automatically)
uv sync
# Install with development dependencies
uv sync --group dev
# Activate the virtual environment (optional; `uv run` works without it)
source .venv/bin/activate
uv run <command> runs any project script inside the project venv without needing to activate it manually. If you prefer an activated shell, run source .venv/bin/activate.
# Import ISIISNET dataset
uv run pz_import_dataset dataset_import=isiisnet
# Import other available datasets
uv run pz_import_dataset dataset_import=flowcamnet
uv run pz_import_dataset dataset_import=lensless
# Basic training with default configuration
uv run pz_train
# Train with specific dataset and model
uv run pz_train dataset=isiisnet model=resnet18
# Use specialized loss for imbalanced data
uv run pz_train dataset=isiisnet model=resnet50 custom_loss=focal
# Override training parameters
uv run pz_train dataset=isiisnet model=resnet18 training_arguments.num_train_epochs=10 training_arguments.learning_rate=1e-4
Planktonzilla uses Hydra for hierarchical configuration management. You can override any configuration parameter:
# Use different model architecture
uv run pz_train model=efficientnet
# Apply different augmentation strategy
uv run pz_train augmentation=autoaugment
# Combine multiple overrides
uv run pz_train dataset=isiisnet model=resnet50 custom_loss=ldam training_arguments.learning_rate=1e-4
The training pipeline composes Hydra-configured datasets, models, and losses through the Hugging Face Trainer, then publishes the resulting checkpoint to the Hub, where external users load it with AutoModelForImageClassification.from_pretrained.
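To make the key=value overrides used in the commands above more concrete, here is a rough sketch of how dotted overrides can be merged into a nested configuration. This is plain Python for illustration only and is not how Hydra is implemented: in real Hydra, an override such as model=resnet50 selects a file from the model/ config group, whereas this sketch treats every override as a simple value. The default values in the example are hypothetical.

```python
import ast
import copy


def parse_value(text):
    """Interpret an override value as a Python literal when possible."""
    lowered = text.lower()
    if lowered in ("true", "false"):
        return lowered == "true"
    try:
        return ast.literal_eval(text)  # handles ints and floats such as 1e-4
    except (ValueError, SyntaxError):
        return text  # plain strings such as "resnet50" stay as they are


def apply_overrides(config, overrides):
    """Return a copy of config with dotted key=value overrides applied."""
    result = copy.deepcopy(config)
    for item in overrides:
        key, _, raw = item.partition("=")
        *parents, leaf = key.split(".")
        node = result
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = parse_value(raw)
    return result


# Hypothetical defaults; the real ones live in the configs/ tree.
defaults = {
    "dataset": "cifar10",
    "model": "resnet18",
    "training_arguments": {"num_train_epochs": 3, "learning_rate": 5e-5},
}

merged = apply_overrides(
    defaults,
    ["dataset=isiisnet", "model=resnet50", "training_arguments.learning_rate=1e-4"],
)
print(merged)
```

Keys that are not overridden, such as training_arguments.num_train_epochs here, keep their default values, which mirrors the behavior the CLI examples rely on.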
flowchart TB
subgraph Configure["1 · Configure"]
direction TB
CLI["CLI<br/>pz_import_dataset · pz_train"]:::entry
CFG["Hydra configs<br/>configs/"]:::cfg
end
subgraph Ingest["2 · Ingest"]
direction TB
DATA_IMPORT["planktonzilla/dataset_import/<br/>DatasetImporter subclasses"]:::code
HF_DATA[("HF Hub<br/>project-oceania datasets")]:::ext
end
subgraph Train["3 · Train"]
direction TB
DATA["planktonzilla/dataset.py<br/>DatasetWrapper"]:::code
MODEL["Model<br/>timm · HF · open_clip"]:::code
LOSS["planktonzilla/loss.py<br/>AbstractHFLoss subclasses"]:::code
TRAIN_LOOP["HF Trainer<br/>planktonzilla/train.py"]:::code
TRACK["Tracking<br/>W&B · MLflow · trackio"]:::ext
OUTPUTS["Local outputs<br/>logs/ · checkpoints/"]:::code
end
subgraph Publish["4 · Publish"]
direction TB
HF_MODEL[("HF Hub<br/>project-oceania models")]:::ext
end
SCRIPTS["scripts/*.sh<br/>SLURM launchers"]:::code
TESTS["tests/<br/>smoke runs"]:::code
CONSUMER(["AutoModelForImageClassification<br/>.from_pretrained"]):::consumer
CLI --> CFG
CFG -.->|configures| DATA_IMPORT
CFG -.->|configures| TRAIN_LOOP
CFG -.->|selects| MODEL
CFG -.->|selects + params| LOSS
- ISIISNET: In-Situ Ichthyoplankton Imaging System Network
- FlowCamNet: FlowCam plankton dataset
- Lensless: Lensless plankton microscopy dataset
- UVP6Net: Underwater Vision Profiler 6 dataset
- WHOI-Plankton: Woods Hole Oceanographic Institution plankton dataset
- ZooLake: Lake Zurich zooplankton dataset
- ZooScanNet: ZooScan plankton dataset
- JEDI-Oceans: JEDI oceanic plankton dataset
- CIFAR-10: Generic image classification benchmark (sanity-check / smoke-test runs)
Planktonzilla includes specialized loss functions designed for imbalanced plankton classification:
- FocalLoss: Addresses class imbalance through dynamic loss weighting
- LDAMLoss: Label-Distribution-Aware Margin loss
- AsymmetricLoss: For multi-label classification scenarios
- RobustAsymmetricLoss: Enhanced version of asymmetric loss
- MaximumMarginLoss: Margin-based learning approach
- BalancedMetaSoftmaxLoss: Meta-learning approach for class balance
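As an illustration of the ideas behind the first two losses in this list, the sketch below computes the focal loss and the LDAM margins for a single example in pure Python. It follows the textbook formulas from the original papers (focal loss as -(1 - p_t)^gamma * log(p_t); LDAM margins proportional to n_j^(-1/4)) and is not the implementation in planktonzilla/loss.py, which operates on Hugging Face model outputs. The function names, default hyperparameters, and class counts are assumptions made for this example.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of logits."""
    peak = max(logits)
    log_total = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return [x - log_total for x in logits]


def focal_loss(logits, target, gamma=2.0):
    """Focal loss for one example: -(1 - p_t)^gamma * log(p_t)."""
    log_pt = log_softmax(logits)[target]
    pt = math.exp(log_pt)
    return -((1.0 - pt) ** gamma) * log_pt


def ldam_margins(class_counts, max_margin=0.5):
    """Per-class margins proportional to n_j^(-1/4); the rarest class gets max_margin."""
    raw = [n ** -0.25 for n in class_counts]
    factor = max_margin / max(raw)
    return [m * factor for m in raw]


def ldam_loss(logits, target, margins, scale=30.0):
    """Cross-entropy after subtracting the class margin from the true-class logit."""
    shifted = list(logits)
    shifted[target] -= margins[target]
    return -log_softmax([scale * z for z in shifted])[target]


# Hypothetical counts: a frequent class (index 0) and a rare class (index 1).
counts = [10000, 100]
margins = ldam_margins(counts)
print("margins:", margins)

# A confidently correct prediction is down-weighted far more than a wrong one.
print("focal, easy example:", focal_loss([4.0, 0.0], target=0))
print("focal, hard example:", focal_loss([0.0, 4.0], target=0))
print("LDAM, rare class:", ldam_loss([0.2, 0.3], target=1, margins=margins))
```

Setting gamma to 0 reduces the focal loss to ordinary cross-entropy, and the rare class receives the larger margin, which is the mechanism both losses use to shift attention toward under-represented classes.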
Integrate with popular experiment tracking tools:
# Enable Weights & Biases tracking
uv run pz_train tracking.use_wandb=true
# Enable MLflow tracking
uv run pz_train tracking.use_mlflow=true
# Enable Trackio
uv run pz_train tracking.use_trackio=true
# Run all tests
uv run pytest
# Run with coverage
uv run pytest --cov=planktonzilla
# Run specific test file
uv run pytest tests/test_datasets.py
# Lint code
uv run ruff check
# Format code
uv run ruff format
To add a new dataset:
- Create a dataset configuration in configs/dataset/your_dataset.yaml
- Ensure your dataset is available on Hugging Face Hub
- Test with: uv run pz_train dataset=your_dataset
To add a new loss function:
- Implement your loss class inheriting from AbstractHFLoss in planktonzilla/loss.py
- Add a configuration file in configs/custom_loss/your_loss.yaml
- Loss functions must handle the ImageClassifierOutputWithNoAttention input format
- Test with: uv run pz_train custom_loss=your_loss
We welcome contributions to Planktonzilla! Please feel free to:
- Report bugs and request features via GitHub Issues
- Submit pull requests for improvements
- Add new datasets or model architectures
- Improve documentation
This project is licensed under the MIT License - see the LICENSE file for details.
If you use Planktonzilla in your research, please cite:
@report{contrerasmontanares:hal-05621003,
title = {Planktonzilla: Multimodal dataset and models for understanding plankton ecosystems},
author = {Contreras Montanares, Alan Gerson and Valenzuela, Luis and Mart{\'i}, Luis and Sanchez-Pi, Nayat},
year = 2026,
month = {May},
url = {https://hal.science/hal-05621003},
note = {Submitted to NeurIPS 2026.},
keywords = {Explainable AI; XAI ; Plankton Classification ; CLIPS ; Multimodal Classification},
pdf = {https://hal.science/hal-05621003v1/file/neurips\_2026.pdf},
eprint = {hal-05621003},
eprinttype = {hal},
hal_id = {hal-05621003},
hal_version = {v1}
}