VideoEraser is a training-free framework that prevents Text-to-Video (T2V) diffusion models from generating videos with undesirable concepts, even when explicitly prompted. It achieves state-of-the-art performance in suppressing undesirable content during T2V generation, reducing it by 46% on average across four tasks compared to baselines.
Moreover, VideoEraser is applicable to multiple T2V diffusion models, including UNet-based models (AnimateDiff, LaVie, ZeroScope, ModelScope) and a DiT-based model (CogVideoX).
| Task | Object Erasure | Celebrity Erasure | Artistic Style Erasure | Explicit Content Erasure |
|---|---|---|---|---|
| Original Prompt | A rally car racing through a snowy forest path. | Jackie Chan is tai chi. | A man running under starry night by Van Gogh. | A naked man is playing basketball. |
| Original Video | object.mp4 | celebrity.mp4 | artist.mp4 | explicit.mp4 |
| Erased Concept | rally car | Jackie Chan | Van Gogh | naked |
| Erased Video | object_removal.mp4 | celebrity_removal.mp4 | artist_removal.mp4 | explicit_removal.mp4 |
- [2025.11] 🎉 Our paper "VideoEraser: Concept Erasure in Text-to-Video Diffusion Models" has been accepted to the EMNLP 2025 Main Conference!
git clone https://github.com/bluedream02/VideoEraser.git
cd VideoEraser/AnimateDiff
# Create environment
conda create -n animatediff python=3.10
conda activate animatediff
pip install -r requirements.txt

Download Pre-trained Models:

# Download the Stable Diffusion v1.5 base model and the AnimateDiff motion module
mkdir -p models/Motion_Module
cd models
git lfs install
git clone https://huggingface.co/stable-diffusion-v1-5/stable-diffusion-v1-5
cd Motion_Module
wget https://huggingface.co/guoyww/animatediff/resolve/main/mm_sd_v15.ckpt
cd ../..

Expected structure:
AnimateDiff/
└── models/
    ├── stable-diffusion-v1-5/          # Stable Diffusion base model
    │   └── ...
    └── Motion_Module/
        └── mm_sd_v15.ckpt              # AnimateDiff motion module
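If git-lfs or wget is unavailable, the same checkpoints can be fetched with the huggingface_hub Python API. A minimal sketch, run from the AnimateDiff/ directory (repo ids match the commands above):

```python
from huggingface_hub import hf_hub_download, snapshot_download

# Stable Diffusion v1.5 base model -> models/stable-diffusion-v1-5/
snapshot_download(
    repo_id="stable-diffusion-v1-5/stable-diffusion-v1-5",
    local_dir="models/stable-diffusion-v1-5",
)

# AnimateDiff motion module -> models/Motion_Module/mm_sd_v15.ckpt
hf_hub_download(
    repo_id="guoyww/animatediff",
    filename="mm_sd_v15.ckpt",
    local_dir="models/Motion_Module",
)
```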
cd VideoEraser/ModelScope
conda create -n modelscope python=3.10
conda activate modelscope
pip install -r requirements.txt

Download Pre-trained Models:
mkdir -p models
cd models
git lfs install
git clone https://huggingface.co/cerspense/zeroscope_v2_576w
git clone https://huggingface.co/damo-vilab/text-to-video-ms-1.7b
cd ..

Expected structure:
ModelScope/
└── models/
    ├── zeroscope_v2_576w/              # ZeroScope model
    │   └── ...
    └── text-to-video-ms-1.7b/          # ModelScope model (alternative)
        └── ...
cd VideoEraser/Lavie
conda env create -f environment.yml
conda activate lavie

Download Pre-trained Models:
Download pre-trained LaVie models, Stable Diffusion 1.4, and stable-diffusion-x4-upscaler:
mkdir -p pretrained_models
cd pretrained_models
wget https://huggingface.co/Vchitect/LaVie/resolve/main/lavie_base.pt
wget https://huggingface.co/Vchitect/LaVie/resolve/main/lavie_interpolation.pt
wget https://huggingface.co/Vchitect/LaVie/resolve/main/lavie_vsr.pt
git lfs install
git clone https://huggingface.co/CompVis/stable-diffusion-v1-4
git clone https://huggingface.co/stabilityai/stable-diffusion-x4-upscaler
cd ..

Expected structure:
Lavie/
└── pretrained_models/
    ├── lavie_base.pt                   # Base T2V model
    ├── lavie_interpolation.pt          # Frame interpolation model
    ├── lavie_vsr.pt                    # Video super-resolution model
    ├── stable-diffusion-v1-4/          # SD 1.4 base model
    │   └── ...
    └── stable-diffusion-x4-upscaler/   # SD x4 upscaler
        └── ...
cd VideoEraser/CogVideoX
conda create -n cogvideox python=3.10
conda activate cogvideox
pip install -r requirements.txt

Download Pre-trained Models:
mkdir -p models
cd models
git clone https://huggingface.co/THUDM/CogVideoX-5b
cd ..

Expected structure:
CogVideoX/
└── models/
    └── CogVideoX-5b/                   # CogVideoX model
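As a quick sanity check that the downloaded weights are intact, they can be loaded with the stock diffusers pipeline (this is plain generation without erasure, not the VideoEraser pipeline):

```python
import torch
from diffusers import CogVideoXPipeline
from diffusers.utils import export_to_video

# Load the locally downloaded weights; bf16 is the recommended dtype for the 5b model.
pipe = CogVideoXPipeline.from_pretrained("models/CogVideoX-5b", torch_dtype=torch.bfloat16)
pipe.enable_model_cpu_offload()  # trades some speed for much lower peak VRAM

# Plain generation to confirm the checkpoint loads and runs end to end.
frames = pipe(prompt="A panda playing guitar in a bamboo forest.", num_frames=49).frames[0]
export_to_video(frames, "sanity_check.mp4", fps=8)
```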
cd AnimateDiff
python scripts/animate.py \
--pretrained-model-path stable-diffusion-v1-5/stable-diffusion-v1-5 \
--prompt "A man running under starry night by Van Gogh." \
--erased-concept "Van Gogh" \
--output-dir ./outputs \
--seed 42

See AnimateDiff/README.md for detailed usage.
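To sweep several erasure targets with the same CLI, a small driver script works; a sketch using the flags documented above (the concept list and prompt template here are illustrative):

```python
import subprocess

# Run the documented AnimateDiff CLI once per erased concept.
for concept in ["Van Gogh", "Picasso", "Claude Monet"]:
    subprocess.run(
        [
            "python", "scripts/animate.py",
            "--pretrained-model-path", "stable-diffusion-v1-5/stable-diffusion-v1-5",
            "--prompt", f"A man running under starry night by {concept}.",
            "--erased-concept", concept,
            "--output-dir", "./outputs",
            "--seed", "42",
        ],
        check=True,
    )
```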
cd ModelScope
# Simple usage with HuggingFace model
python inference.py \
--model cerspense/zeroscope_v2_576w \
--prompt "A man running under starry night by Van Gogh." \
--erased-concept "Van Gogh" \
--output ./outputs \
--seed 42
# Or with ModelScope backbone
python inference.py \
--model damo-vilab/text-to-video-ms-1.7b \
--prompt "A man running under starry night by Van Gogh." \
--erased-concept "Van Gogh" \
--output ./outputs

See ModelScope/README.md for detailed usage.
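For comparison, the simplest alternative to concept erasure is steering the sampler away with a negative prompt in stock diffusers. This naive baseline is NOT the VideoEraser method, but it is a useful point of reference:

```python
import torch
from diffusers import DiffusionPipeline
from diffusers.utils import export_to_video

pipe = DiffusionPipeline.from_pretrained(
    "cerspense/zeroscope_v2_576w", torch_dtype=torch.float16
)
pipe.to("cuda")

# Push generation away from the concept with a negative prompt only.
frames = pipe(
    "A man running under starry night by Van Gogh.",
    negative_prompt="Van Gogh",
    num_frames=24,
    height=320,
    width=576,
).frames[0]
export_to_video(frames, "negative_prompt_baseline.mp4")
```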
cd Lavie/base
python pipelines/sample.py \
--config configs/example.yaml \
--text-prompt "A man running under starry night by Van Gogh." \
--unlearn-prompt "Van Gogh" \
--output-dir ./outputs \
--seed 42

Note: LaVie now supports command-line arguments that override config file settings.
See Lavie/README.md for detailed usage.
cd CogVideoX
python cli_demo.py \
--prompt "A man running under starry night by Van Gogh." \
--unsafe_concept "Van Gogh" \
--model_path models/CogVideoX-5b \
--output_path ./output.mp4

See CogVideoX/README.md for detailed usage.
We provide evaluation scripts for assessing concept erasure performance. The scripts process videos frame-by-frame: if any frame contains the target concept, the video is considered to contain that concept.
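A minimal sketch of this frame-wise decision rule, assuming OpenCV; `frame_contains_concept` is a hypothetical stand-in for whichever per-frame judge a script uses (a GPT-style classifier, an object detector, NudeNet, ...):

```python
import cv2

def video_contains_concept(video_path, frame_contains_concept, num_samples=5):
    """Return True if ANY sampled frame triggers the per-frame detector."""
    cap = cv2.VideoCapture(video_path)
    total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    # Score a uniform sample of frames rather than every frame.
    indices = [int(i * total / num_samples) for i in range(num_samples)]
    flagged = False
    for idx in indices:
        cap.set(cv2.CAP_PROP_POS_FRAMES, idx)
        ok, frame = cap.read()
        if ok and frame_contains_concept(frame):
            flagged = True
            break
    cap.release()
    return flagged
```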
cd evaluation
# 1. Artistic Style Detection (requires OpenAI API)
export OPENAI_API_KEY="your_api_key"
export OPENAI_BASE_URL="https://api.openai.com/v1" # Optional, defaults to OpenAI
python artist.py \
--input-folder /path/to/videos \
--output-folder ./results \
--num-samples 5
# 2. Object Detection
python object.py \
--input-folder /path/to/videos \
--output-folder ./results \
--target-objects "cassette player" \
--num-samples 5
# 3. Explicit Content Detection (requires nudenet)
pip install nudenet
python explict.py \
--input-folder /path/to/videos \
--output-folder ./results \
--num-samples 5

This work builds upon several excellent open-source projects:
- AnimateDiff - Motion module for Stable Diffusion
- Text-To-Video-Finetuning - ZeroScope and ModelScope training framework
- LaVie - Video generation with cascaded diffusion models
- CogVideoX - Large-scale text-to-video generation model
- Stable Diffusion - Foundation text-to-image model
- SEGA - Instructing Text-to-Image Models using Semantic Guidance
- SAFREE - Safe and free text-to-image generation
We thank the authors for their valuable contributions to the community.
If you find VideoEraser useful in your research, please cite:
@inproceedings{xu2025videoeraser,
title={VideoEraser: Concept Erasure in Text-to-Video Diffusion Models},
author={Xu, Naen and Zhang, Jinghuai and Li, Changjiang and Chen, Zhi and Zhou, Chunyi and Li, Qingming and Du, Tianyu and Ji, Shouling},
booktitle={Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing},
pages={5965--5994},
year={2025}
}