Diffusion Personalization

A comparative study of six personalization methods for text-to-image diffusion models, conducted as an independent study at Indiana University under the guidance of Professor Mohammad Al Hasan.

View the Interactive Report — Detailed results, architecture diagrams, generated samples, and method comparisons.

All experiments use Stable Diffusion v1.4/v1.5 on NVIDIA H100 80 GB GPUs (IU Quartz HPC) with HuggingFace Diffusers and TRL.

Overview

This project systematically evaluates how different personalization strategies affect image quality, subject fidelity, and inference speed. Each method occupies a different point on the fidelity-efficiency spectrum:

  • DreamBooth: full U-Net fine-tuning for maximum subject fidelity (~3.4 GB)
  • LoRA: low-rank adapters for efficient style transfer (~3 MB)
  • Textual Inversion: embedding-only optimization, the lightest approach (~3-24 KB)
  • Custom Diffusion: selective fine-tuning of the cross-attention K,V projections, with multi-concept support (~75 MB)
  • LCM Distillation: consistency distillation for 27x faster inference (108 ms vs 2,913 ms for DreamBooth)
  • DDPO: reinforcement learning with aesthetic reward optimization (aesthetic score: 6.28)
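
To give a rough sense of why low-rank adapters are so much smaller than full fine-tuning, the sketch below counts parameters for a single linear layer. The layer sizes (768 to 320) and the rank of 4 are illustrative assumptions, not values taken from this study's configuration, and the helper names are hypothetical.

```python
def full_param_count(d_in: int, d_out: int) -> int:
    """Weights in a dense d_out x d_in projection (bias ignored)."""
    return d_in * d_out


def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """LoRA learns an update B @ A instead of a full weight update,
    where A is rank x d_in and B is d_out x rank."""
    return rank * d_in + d_out * rank


# Illustrative sizes only (assumed, not read from the experiment configs).
d_in, d_out, rank = 768, 320, 4
full = full_param_count(d_in, d_out)
lora = lora_param_count(d_in, d_out, rank)
print(f"full: {full:,}  lora: {lora:,}  ratio: {full / lora:.1f}x")
```

The adapter count grows linearly with the rank, so the rank is the main knob trading adapter size against capacity.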

Methods

| Method | What's Trained | Trainable Params | Storage |
| --- | --- | --- | --- |
| DreamBooth | Entire U-Net | ~860M | ~3.4 GB |
| LoRA | Low-rank adapters | ~1.6-6.4M | ~3 MB |
| Textual Inversion | Embedding only | ~768-6,144 | ~3-24 KB |
| Custom Diffusion | Cross-attn K,V + token | ~57M | ~75 MB |
| LCM Distillation | Student model | Full model | ~3.4 GB |
| DDPO | U-Net (RL) | Full model | ~3.4 GB |
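
The storage figures for Textual Inversion and DreamBooth are consistent with saving each trainable parameter as a 32-bit float. The sketch below reproduces that arithmetic. The 4-bytes-per-parameter figure is an assumption, and the LoRA and Custom Diffusion rows need not follow it, since checkpoints may use a different precision or layout.

```python
BYTES_PER_FP32 = 4  # assumption: parameters are saved as 32-bit floats


def storage_bytes(num_params: int, bytes_per_param: int = BYTES_PER_FP32) -> int:
    """Raw size of a parameter tensor, ignoring file-format overhead."""
    return num_params * bytes_per_param


# Parameter counts restated from the Methods table.
examples = [
    ("Textual Inversion (low)", 768),
    ("Textual Inversion (high)", 6_144),
    ("DreamBooth", 860_000_000),
]

for name, params in examples:
    print(f"{name:<26} {params:>13,} params  {storage_bytes(params):>15,} bytes")
```

Under this assumption 768 parameters occupy 3,072 bytes (3 KB at 1,024 bytes per KB) and 860M parameters occupy about 3.44 billion bytes.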

Repository Structure

Diffusion-Personalization/
├── jobs/                          # SLURM job submission scripts
│   ├── submit_train.sh
│   └── submit_train2.sh
├── models/
│   └── experiment_tracker/        # Experiment logs (per-method CSVs)
│       ├── DB.csv                 # DreamBooth experiments
│       ├── LORA.csv               # LoRA experiments
│       ├── TI.csv                 # Textual Inversion experiments
│       ├── CD.csv                 # Custom Diffusion experiments
│       ├── LCD.csv                # LCM Distillation experiments
│       └── RL_DDPO.csv            # DDPO experiments
├── inputs/
│   └── prompts/                   # Evaluation prompt files
│       ├── prompts_dog.txt        # DreamBooth prompts
│       ├── prompts_dog_CD.txt     # Custom Diffusion prompts
│       ├── prompts_cat.txt        # Textual Inversion prompts
│       ├── prompts_naruto.txt     # LoRA style transfer prompts
│       ├── prompts_lcm.txt        # LCM general prompts
│       ├── prompts_ddpo.txt       # DDPO aesthetic prompts
│       └── prompts_offsubject.txt # Off-subject prompts
├── scripts/
│   ├── train/                     # Training scripts (per method)
│   ├── infer/                     # Inference scripts (per method)
│   ├── experiment.py              # Experiment tracker
│   ├── experiment_after_evaluation.py
│   ├── experiment_after_training.sh
│   ├── eval_infer.py              # General evaluation inference
│   ├── eval_infer_ti.py           # TI-specific evaluation
│   ├── evaluation.py              # Metric computation (CLIP-T, CLIP-I, DINO-I)
│   ├── evaluation_ddpo.py         # DDPO evaluation (aesthetic + CLIP-T)
│   └── evaluation_lcm.py          # LCM evaluation (CLIP-T + latency)
├── .gitignore
└── README.md
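
The prompt files under inputs/prompts/ can be read with a few lines of Python. This is a minimal sketch that assumes one prompt per line with blank lines ignored. Both the file format and the helper name `load_prompts` are assumptions, not taken from the repository's scripts.

```python
from pathlib import Path


def load_prompts(path: str) -> list[str]:
    """Return the non-empty, whitespace-stripped lines of a prompt file."""
    text = Path(path).read_text(encoding="utf-8")
    return [line.strip() for line in text.splitlines() if line.strip()]


if __name__ == "__main__":
    # Path taken from the repository tree; skipped if the file is absent.
    prompt_file = Path("inputs/prompts/prompts_dog.txt")
    if prompt_file.exists():
        for prompt in load_prompts(str(prompt_file)):
            print(prompt)
```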

Prerequisites

  • Python 3.10+
  • CUDA 12.6+
  • PyTorch 2.11+
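
A quick way to confirm that the interpreter and the installed PyTorch meet these minimums is sketched below. The helper `meets_minimum` is a hypothetical name. It compares only the leading major.minor numbers and ignores local suffixes such as `+cu126`. The CUDA version is not checked here.

```python
import re
import sys
from importlib.metadata import PackageNotFoundError, version


def meets_minimum(version_string: str, minimum: tuple[int, int]) -> bool:
    """Compare the leading 'major.minor' of a version string to a minimum."""
    match = re.match(r"(\d+)\.(\d+)", version_string)
    if match is None:
        return False
    return (int(match.group(1)), int(match.group(2))) >= minimum


print("Python OK:", sys.version_info[:2] >= (3, 10))

try:
    # Reads package metadata without importing torch itself.
    torch_version = version("torch")
except PackageNotFoundError:
    print("PyTorch is not installed")
else:
    print(f"PyTorch {torch_version} OK:", meets_minimum(torch_version, (2, 11)))
```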

Setup

1. Clone this repository

git clone https://github.com/meghanaNanuvala/Diffusion-Personalization.git
cd Diffusion-Personalization

2. Clone HuggingFace Diffusers

The Diffusers library is required for all training and inference scripts. It is not included in this repo due to size constraints.

git clone https://github.com/huggingface/diffusers.git
cd diffusers
pip install -e .
cd ..

3. Clone TRL (for DDPO only)

The TRL library is required for DDPO reinforcement learning training.

git clone https://github.com/huggingface/trl.git
cd trl
pip install -e .
cd ..

4. Install additional dependencies

pip install transformers accelerate safetensors
pip install open-clip-torch        # For CLIP-T and CLIP-I metrics
pip install torchvision            # For DINO-I metrics
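
After installation, a short check like the following can confirm that each package is importable. The module names are the import names these packages normally expose (for example, open-clip-torch is imported as `open_clip`). Treat the lists as an assumption about what the scripts need, and `missing_modules` as a hypothetical helper.

```python
from importlib.util import find_spec

# Assumed import names for the packages installed in the Setup steps.
REQUIRED_MODULES = [
    "torch",
    "torchvision",
    "diffusers",
    "transformers",
    "accelerate",
    "safetensors",
    "open_clip",
]
OPTIONAL_MODULES = ["trl"]  # needed for DDPO only


def missing_modules(names: list[str]) -> list[str]:
    """Return the names that cannot be located on the import path."""
    return [name for name in names if find_spec(name) is None]


print("missing required:", missing_modules(REQUIRED_MODULES) or "none")
print("missing optional:", missing_modules(OPTIONAL_MODULES) or "none")
```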

Running Experiments

Training

Training scripts are in scripts/train/. Each method has its own training script. Example for DreamBooth:

# On SLURM (IU Quartz HPC)
sbatch jobs/submit_train.sh

# Or run directly
python scripts/train/train_dreambooth.py

Inference and Evaluation

# Generate images from a trained model
python scripts/eval_infer.py

# Compute metrics (CLIP-T, CLIP-I, DINO-I)
python scripts/evaluation.py

# LCM-specific evaluation (CLIP-T + latency)
python scripts/evaluation_lcm.py

# DDPO-specific evaluation (aesthetic score + CLIP-T)
python scripts/evaluation_ddpo.py
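
The three metrics named above are commonly defined as cosine similarities between embeddings. CLIP-T compares a prompt's CLIP text embedding with a generated image's CLIP image embedding, while CLIP-I and DINO-I compare generated and reference image embeddings from CLIP and DINO respectively. The sketch below shows only that final similarity step on placeholder vectors. It is not the code in scripts/evaluation.py, the function names are hypothetical, and the embedding values are made up for illustration.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def mean_pairwise_similarity(generated: list[list[float]],
                             references: list[list[float]]) -> float:
    """Average similarity over every (generated, reference) pair."""
    scores = [cosine_similarity(g, r) for g in generated for r in references]
    return sum(scores) / len(scores)


# Placeholder embeddings; real ones would come from a CLIP or DINO encoder.
generated = [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1]]
references = [[1.0, 0.0, 0.0]]
print(f"image-image score: {mean_pairwise_similarity(generated, references):.3f}")
```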

Experiment Tracking

All experiments are automatically logged to CSV files in models/experiment_tracker/. Each row records hyperparameters, hardware context, and evaluation metrics.
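
A minimal sketch of this kind of CSV logging is shown below. The column names, the `log_experiment` helper, and the example values are hypothetical. The actual schema is whatever scripts/experiment.py writes.

```python
import csv
from pathlib import Path


def log_experiment(csv_path: str, row: dict) -> None:
    """Append one experiment row, writing the header if the file is new."""
    path = Path(csv_path)
    path.parent.mkdir(parents=True, exist_ok=True)
    write_header = not path.exists() or path.stat().st_size == 0
    with path.open("a", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=list(row.keys()))
        if write_header:
            writer.writeheader()
        writer.writerow(row)


if __name__ == "__main__":
    import tempfile

    # Written to a temporary directory so the real tracker files are untouched.
    with tempfile.TemporaryDirectory() as tmp:
        out = Path(tmp) / "DB.csv"
        # Hypothetical columns and values, for illustration only.
        log_experiment(str(out), {"method": "dreambooth", "gpu": "H100", "clip_t": 0.274})
        print(out.read_text(encoding="utf-8"))
```

Every row passed to the same file should use the same keys in the same order, since the header is written only once.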

Key Results

| Method | Best CLIP-T | Best CLIP-I | Best DINO-I | Latency |
| --- | --- | --- | --- | --- |
| DreamBooth | 0.274 | 0.845 | 0.588 | 2,913 ms |
| LoRA | 0.249 | - | - | 835 ms |
| Textual Inversion | 0.277 | 0.857 | 0.690 | 613 ms |
| Custom Diffusion | 0.258 | 0.741 | 0.081 | 644 ms |
| LCM | 0.252 | - | - | 108 ms |
| DDPO | 0.232 | - | - | 1,187 ms |

DDPO achieves an aesthetic score of 6.28 (LoRA variant). LCM achieves a 27x inference speedup (108 ms vs 2,913 ms for DreamBooth).
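
The 27x figure follows directly from the reported latencies. As a quick check, the sketch below computes each method's speedup relative to DreamBooth's 2,913 ms. The dictionary simply restates the Key Results values, and `speedup` is a hypothetical helper name.

```python
# Latencies in milliseconds, restated from the Key Results table.
LATENCY_MS = {
    "DreamBooth": 2913,
    "LoRA": 835,
    "Textual Inversion": 613,
    "Custom Diffusion": 644,
    "LCM": 108,
    "DDPO": 1187,
}


def speedup(baseline_ms: float, method_ms: float) -> float:
    """How many times faster a method is than the baseline."""
    return baseline_ms / method_ms


baseline = LATENCY_MS["DreamBooth"]
for method, latency in LATENCY_MS.items():
    print(f"{method:<18} {latency:>5} ms  {speedup(baseline, latency):5.1f}x")
```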

For detailed per-prompt breakdowns, generated samples, and architecture diagrams, see the interactive report.

Hardware

  • GPU: NVIDIA H100 80 GB HBM3
  • Cluster: IU Quartz HPC
  • CUDA: 12.6
  • PyTorch: 2.11

Author

Meghana Nanuvala - Indiana University

Independent study under Professor Mohammad Al Hasan, Luddy School of Informatics, Computing, and Engineering, Indiana University Indianapolis.
