Mr-TalhaIlyas/CURL

🏥 CURL: Contrastive Ultrasound Video Representation Learning

A novel self-supervised framework for fetal movement detection from extended ultrasound video recordings


🎯 Overview Paper

CURL (Contrastive Ultrasound Video Representation Learning) is a self-supervised framework designed for fetal movement assessment from ultrasound videos. It uses a dual-contrastive loss that captures both spatial (anatomical) and temporal (motion-based) features, which supports robust representation learning of fetal movement dynamics.

🔬 Key Innovations

  • 🎭 Dual-Contrastive Learning: Combines spatial (SimCLR-style NT-Xent) and temporal contrastive objectives
  • 🎯 Task-Specific Sampling: Intelligent sampling strategy for movement vs. non-movement segments
  • 🔄 Flexible Inference: Supports ultrasound recordings of arbitrary length through probabilistic fine-tuning
  • 🏗️ Modular Architecture: Support for both SlowFast and Vision Transformer (ViT) backbones

Figure: CURL Framework Overview

Pipeline Overview: Starting from expertly annotated ultrasound videos (A), CURL splits clips into spatiotemporal patches (B), uses transformer backbones with dual-contrastive learning to extract robust features, fine-tunes with lightweight classifiers (C), and delivers clinically reliable fetal movement detection (D).
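
The README does not spell out how recordings of arbitrary length are scored at inference time. One common pattern, shown here purely as an assumption and not as the repository's method, is to slide a fixed-length window over the video, score each clip, and average the per-clip probabilities for every frame. The names `clip_windows`, `aggregate_probabilities`, `clip_len`, and `stride` are hypothetical.

```python
from typing import List, Sequence, Tuple


def clip_windows(num_frames: int, clip_len: int, stride: int) -> List[Tuple[int, int]]:
    """Return [start, end) frame ranges that cover a video of any length."""
    if clip_len <= 0 or stride <= 0:
        raise ValueError("clip_len and stride must be positive")
    if num_frames <= clip_len:
        return [(0, num_frames)]
    starts = list(range(0, num_frames - clip_len + 1, stride))
    # Add a final window flush with the end so trailing frames are not dropped.
    if starts[-1] + clip_len < num_frames:
        starts.append(num_frames - clip_len)
    return [(s, s + clip_len) for s in starts]


def aggregate_probabilities(
    num_frames: int,
    windows: Sequence[Tuple[int, int]],
    clip_probs: Sequence[float],
) -> List[float]:
    """Average the movement probability of every clip that covers each frame."""
    totals = [0.0] * num_frames
    counts = [0] * num_frames
    for (start, end), prob in zip(windows, clip_probs):
        for t in range(start, end):
            totals[t] += prob
            counts[t] += 1
    return [tot / cnt if cnt else 0.0 for tot, cnt in zip(totals, counts)]


# Placeholder clip scores; in practice they would come from the fine-tuned classifier.
example_windows = clip_windows(num_frames=10, clip_len=4, stride=3)
example_frame_probs = aggregate_probabilities(10, example_windows, [0.9, 0.1, 0.5])
```

Overlapping windows smooth the per-frame estimate at the cost of extra forward passes; a stride equal to `clip_len` gives non-overlapping clips.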


🚀 Quick Start

📋 Prerequisites

  • Python 3.8+
  • CUDA-capable GPU (recommended)
  • 16GB+ RAM for video processing

🔧 Installation

  1. Clone the repository
git clone https://github.com/Mr-TalhaIlyas/CURL.git
cd CURL
  2. Create a virtual environment
# Using conda (recommended)
conda create -n curl python=3.8
conda activate curl

# Or using venv
python -m venv curl
source curl/bin/activate  # Linux/Mac
curl\Scripts\activate     # Windows
  3. Install dependencies
# Using pip
pip install -r requirements.txt

# Or using conda, inside the activated environment
# (requires requirements.txt to use conda-compatible package specs)
conda install --file requirements.txt

📁 Data Preparation

  1. Organize your data structure:
data/
├── videos/           # Raw ultrasound videos (.mp4)
├── optical_flow/     # Optical flow videos (.mp4)
├── labels/           # Label files (.npy)
└── folds/            # Train/test split files
    ├── train_fold_1.txt
    ├── test_fold_1.txt
    └── ...
  2. Update configuration:
# In scripts/configs/config.py
config = dict(
    vid_dir = "path/to/videos/",
    flow_dir = "path/to/optical_flow/", 
    lbl_dir = "path/to/labels/",
    folds = "path/to/folds/"
)
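
Before launching a long training run, it can help to confirm that every video listed in a fold has its optical-flow and label counterparts. The sketch below assumes, without confirmation from the repository, that each fold file lists one video name (without extension) per line and that the video, flow, and label files share that name. `read_fold` and `find_missing` are hypothetical helpers, not part of the codebase.

```python
from pathlib import Path
from typing import Dict, List


def read_fold(fold_file: Path) -> List[str]:
    """Read one video identifier per line, skipping blank lines (assumed format)."""
    return [ln.strip() for ln in fold_file.read_text().splitlines() if ln.strip()]


def find_missing(config: Dict[str, str], fold_file: Path) -> List[str]:
    """List files implied by the data layout that are absent on disk."""
    missing: List[str] = []
    for stem in read_fold(fold_file):
        expected = [
            Path(config["vid_dir"]) / f"{stem}.mp4",   # raw ultrasound video
            Path(config["flow_dir"]) / f"{stem}.mp4",  # optical flow video
            Path(config["lbl_dir"]) / f"{stem}.npy",   # label array
        ]
        missing.extend(str(p) for p in expected if not p.exists())
    return missing


if __name__ == "__main__":
    config = dict(
        vid_dir="data/videos/",
        flow_dir="data/optical_flow/",
        lbl_dir="data/labels/",
        folds="data/folds/",
    )
    fold = Path(config["folds"]) / "train_fold_1.txt"
    if fold.exists():
        for path in find_missing(config, fold):
            print("missing:", path)
```

If the fold files use a different format (full paths, file names with extensions), only `read_fold` and the three `expected` paths need to change.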

🎓 Training

1. 🔄 Self-Supervised Pre-training

Choose between two backbone architectures:

SlowFast + Dual Contrastive Loss

# Train with both spatial and temporal losses
python dual_contrastive_main.py \
    --enable_temporal_loss \
    --spatial_loss_weight 1.0 \
    --temporal_loss_weight 0.5 \
    --dual_loss_mode both \
    --epochs 100

# Spatial-only training
python dual_contrastive_main.py \
    --spatial_loss_weight 1.0 \
    --dual_loss_mode spatial_only

Vision Transformer (ViT) + Dual Contrastive Loss

# MAE-style contrastive learning
python run_mae_contrastive.py \
    --enable_temporal_loss \
    --spatial_loss_weight 1.0 \
    --temporal_loss_weight 0.7 \
    --embed_dim 1024 \
    --depth 24

2. 🎯 Fine-tuning for Classification

# Fine-tune pre-trained contrastive model
python run_finetune.py \
    --model_type contrastive_mae \
    --checkpoint_path /path/to/pretrained_model.pth \
    --epochs 30 \
    --lr 2e-4 \
    --loss_type focal

# Fine-tune standard MAE model
python run_finetune.py \
    --model_type standard_mae \
    --checkpoint_path /path/to/mae_model.pth \
    --epochs 30
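
For readers unfamiliar with the `--loss_type focal` option, the following dependency-free sketch shows the standard binary focal loss, which down-weights easy examples so that rare movement segments contribute more to the gradient. It illustrates the general formula only; the repository's implementation may differ, and `binary_focal_loss` with its `gamma` and `alpha` defaults (the values commonly used in the literature) is an assumption.

```python
import math
from typing import Sequence


def binary_focal_loss(
    probs: Sequence[float],
    targets: Sequence[int],
    gamma: float = 2.0,
    alpha: float = 0.25,
    eps: float = 1e-7,
) -> float:
    """Mean of -alpha_t * (1 - p_t)^gamma * log(p_t) over the batch."""
    if len(probs) != len(targets) or not probs:
        raise ValueError("probs and targets must be non-empty and equally long")
    total = 0.0
    for p, y in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)               # avoid log(0)
        p_t = p if y == 1 else 1.0 - p                # probability of the true class
        alpha_t = alpha if y == 1 else 1.0 - alpha    # class-balancing weight
        total += -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
    return total / len(probs)
```

With `gamma=0` and `alpha=0.5` the expression reduces to half the usual binary cross-entropy, which is a convenient sanity check.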

🏗️ Architecture

🎭 Dual-Contrastive Loss Framework

# Spatial Contrastive Loss (NT-Xent)
spatial_loss = NT_XentLoss(spatial_features_i, spatial_features_j)

# Temporal Contrastive Loss (TC)
temporal_loss = temporal_contrastive_loss(
    temporal_features_i, 
    temporal_features_j, 
    temperature, 
    clusters=8
)

# Combined Loss
total_loss = α * spatial_loss + β * temporal_loss
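
The weighted sum above presumably corresponds to the `--spatial_loss_weight`, `--temporal_loss_weight`, and `--dual_loss_mode` flags used in the training commands. A minimal sketch of that combination follows; `combine_losses` is a hypothetical helper, and only the two modes that appear in this README (`both` and `spatial_only`) are handled.

```python
def combine_losses(
    spatial_loss: float,
    temporal_loss: float,
    spatial_weight: float = 1.0,
    temporal_weight: float = 0.5,
    mode: str = "both",
) -> float:
    """Weighted sum of the two contrastive terms, selected by mode."""
    if mode == "spatial_only":
        # The temporal term is ignored entirely.
        return spatial_weight * spatial_loss
    if mode == "both":
        return spatial_weight * spatial_loss + temporal_weight * temporal_loss
    raise ValueError(f"unsupported mode: {mode!r}")
```

The function uses only multiplication and addition, so the same code works unchanged when the inputs are scalar PyTorch tensors.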

🔧 Supported Architectures

| Model | Backbone | Key Features |
|---|---|---|
| SimCLR + SlowFast | SlowFast ResNet | Two-stream processing for spatial-temporal features |
| Contrastive MAE | Vision Transformer | Patch-based processing with attention mechanisms |
| Hybrid Models | Custom | Combine benefits of both approaches |

⚙️ Configuration

🔧 Key Parameters

# Dual contrastive learning
enable_temporal_loss = True
spatial_loss_weight = 1.0
temporal_loss_weight = 0.5
temperature_spatial = 0.5
temperature_temporal = 0.1

# Temporal contrastive loss
tc_clusters = 8
tc_num_iters = 10
tc_do_entro = True  # Enable IID regularization

# Model architecture  
mae_contrastive = dict(
    embed_dim = 1024,
    depth = 24,
    num_heads = 16,
    projection_dim = 256,
    temporal_projection_dim = 128
)
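
The `mae_contrastive` values above match a ViT-Large-sized encoder. As a rough sanity check before allocating GPU memory, the sketch below estimates the per-head dimension and the weight count of the transformer blocks. It assumes an MLP expansion ratio of 4 (not stated in this README) and ignores biases, layer norms, patch and position embeddings, and the projection heads; `vit_encoder_summary` is a hypothetical helper.

```python
from typing import Dict


def vit_encoder_summary(
    embed_dim: int, depth: int, num_heads: int, mlp_ratio: float = 4.0
) -> Dict[str, int]:
    """Approximate size of a stack of standard transformer encoder blocks."""
    if embed_dim % num_heads != 0:
        raise ValueError("embed_dim must be divisible by num_heads")
    head_dim = embed_dim // num_heads
    attn_weights = 4 * embed_dim * embed_dim                   # Q, K, V, output projections
    mlp_weights = int(2 * mlp_ratio * embed_dim * embed_dim)   # two linear layers
    return {
        "head_dim": head_dim,
        "approx_block_params": depth * (attn_weights + mlp_weights),
    }


summary = vit_encoder_summary(embed_dim=1024, depth=24, num_heads=16)
print(summary)  # head_dim 64, roughly 3.0e8 block weights
```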

📁 Project Structure

CURL/
├── 📄 README.md
├── 📋 requirements.txt
├── 📂 scripts/
│   ├── 🔧 configs/
│   │   └── config.py
│   ├── 📊 data/
│   │   ├── simclr_loader.py
│   │   ├── dataloader.py
│   │   └── utils.py
│   ├── 🏗️ models/
│   │   ├── mae/
│   │   ├── slowfast/
│   │   └── contrastive_mae.py
│   ├── 🛠️ tools/
│   │   ├── nt_xnet.py           # Spatial contrastive loss
│   │   ├── tc_loss.py           # Temporal contrastive loss
│   │   └── simclr_training.py
│   ├── 🎓 Training Scripts
│   │   ├── main_simclr.py
│   │   ├── main_mae_contrastive.py
│   │   └── finetune_contrastive_mae.py
│   └── 🚀 Run Scripts
│       ├── dual_contrastive_main.py
│       ├── run_mae_contrastive.py
│       └── run_finetune.py
└── 📸 screens/
    └── summary.jpg

🔬 Technical Details

🎯 Loss Functions

Spatial Contrastive Loss (NT-Xent)

  • Based on SimCLR framework
  • Learns anatomical feature representations
  • Temperature-scaled InfoNCE loss
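
To make the NT-Xent objective concrete, here is a dependency-free sketch on plain Python lists. It follows the SimCLR formulation (cosine similarity, temperature scaling, every other embedding in the batch treated as a negative) and is meant for illustration only; the repository's PyTorch version in `tools/` may differ in detail. `nt_xent_loss` and its default temperature of 0.5 are assumptions.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def _normalize(v: Vector) -> List[float]:
    norm = math.sqrt(sum(x * x for x in v)) or 1.0  # guard against zero vectors
    return [x / norm for x in v]


def _dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def nt_xent_loss(
    z_i: Sequence[Vector], z_j: Sequence[Vector], temperature: float = 0.5
) -> float:
    """NT-Xent over N positive pairs (z_i[k], z_j[k]) from two augmented views."""
    n = len(z_i)
    if n == 0 or n != len(z_j):
        raise ValueError("z_i and z_j must be non-empty and equally long")
    z = [_normalize(v) for v in list(z_i) + list(z_j)]  # 2N unit vectors
    total = 0.0
    for a in range(2 * n):
        positive = (a + n) % (2 * n)  # index of the other view of the same sample
        logits = {
            b: _dot(z[a], z[b]) / temperature for b in range(2 * n) if b != a
        }
        # Numerically stable log-sum-exp over all candidates except the anchor.
        peak = max(logits.values())
        log_denominator = peak + math.log(
            sum(math.exp(s - peak) for s in logits.values())
        )
        total += log_denominator - logits[positive]
    return total / (2 * n)
```

The loss is low when the two views of each sample are close and far from every other sample, which is why correctly paired embeddings score lower than shuffled ones.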

Temporal Contrastive Loss (TC)

  • Novel clustering-based approach
  • Learns motion dynamics
  • Combines Cross-Level Distillation (CLD) and IID regularization

🎭 Data Augmentation

  • Spatial: Random cropping, color jittering, Gaussian blur
  • Temporal: Frame dropping, temporal jittering
  • Domain-specific: Ultrasound-aware transformations
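
The README does not define the temporal augmentations precisely. One plausible reading, sketched below on lists of frame indices, is that frame dropping removes random frames while keeping the clip length fixed, and temporal jittering shifts the clip start by a few frames. `drop_frames`, `jitter_start`, `drop_prob`, and `max_shift` are hypothetical names and do not come from the repository.

```python
import random
from typing import List


def drop_frames(indices: List[int], drop_prob: float, rng: random.Random) -> List[int]:
    """Randomly drop frames, then repeat the last kept frame to restore the length."""
    if not indices:
        return []
    kept = [i for i in indices if rng.random() >= drop_prob]
    if not kept:
        kept = [indices[0]]  # always keep at least one frame
    kept += [kept[-1]] * (len(indices) - len(kept))
    return kept


def jitter_start(
    num_frames: int, clip_len: int, start: int, max_shift: int, rng: random.Random
) -> List[int]:
    """Shift the clip start by up to max_shift frames, clamped to the video bounds."""
    shift = rng.randint(-max_shift, max_shift)
    last_valid_start = max(num_frames - clip_len, 0)
    new_start = min(max(start + shift, 0), last_valid_start)
    return list(range(new_start, min(new_start + clip_len, num_frames)))


rng = random.Random(0)  # fixed seed so the augmentation is reproducible
clip = jitter_start(num_frames=100, clip_len=16, start=50, max_shift=4, rng=rng)
augmented = drop_frames(clip, drop_prob=0.2, rng=rng)
```

Working on indices rather than decoded frames keeps the augmentation cheap and lets the same indices be applied to both the raw video and its optical-flow counterpart.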

📚 Citation

If you find this work useful, please cite our paper. The paper is currently under review, so no citation entry is available yet.

🐛 Issues

Found a bug? Please open an issue with:

  • Detailed description
  • Steps to reproduce
  • Environment details
  • Expected vs actual behavior

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.


🙏 Acknowledgments

  • Thanks to the medical imaging community for inspiration
  • Built upon excellent work in self-supervised learning
  • Special thanks to the SimCLR and MAE teams
