Official implementation of ChordPrompt: Orchestrating Cross-Modal Prompt Synergy for Multi-domain Incremental Learning in CLIP, accepted at ECML-PKDD 2025.
This repository focuses on multi-domain task-incremental learning with CLIP-based prompt tuning.
The current codebase is organized around a few core modules:
- `src/main.py`: entry point for training.
- `src/models/prompt_tune.py`: prompt-tuning training loop.
- `src/models/prompt_factory.py`: unified prompt-model construction.
- `src/models/evaluation.py`: single-checkpoint evaluation during training.
- `src/general_eval.py`: batch evaluation across saved checkpoints.
- `src/task_retrieval.py`: task retrieval logic used at inference time.
- `custom_clip/PromptCross.py`: ChordPrompt model definition and prompt/prototype memory.
- `scripts/*.sh`: reproduction and ablation scripts.
Install dependencies with:
```shell
pip install -r requirements.txt
```

Required datasets:

- ImageNet
- Conceptual_Captions
Target datasets used in the paper:
- Aircraft
- Caltech101
- CIFAR10
- CIFAR100
- DTD
- EuroSAT
- Flowers
- Food
- MNIST
- OxfordPet
- StanfordCars
- SUN397
By default, the code reads data from `D:/dataset/`. Override this with `--data-location`.
Train a single task from scratch:
```shell
python -m src.main \
    --train-mode prompt \
    --trainer CPrompt \
    --train-dataset Aircraft \
    --eval-datasets Aircraft \
    --data-location D:/dataset/ \
    --batch-size 64 \
    --batch-size-eval 64 \
    --num-workers 4 \
    --lr 2e-3 \
    --iterations 1000 \
    --prompt_width 2 \
    --prompt_depth_vision 12 \
    --prompt_depth_text 12 \
    --task-query-mode hybrid \
    --task-query-batches 2 \
    --task-query-topk 3 \
    --task-query-conf-weight 0.2 \
    --save ckpt/exp_CPrompt_2_12
```

Continue training on the next task:
```shell
python -m src.main \
    --train-mode prompt \
    --trainer CPrompt \
    --train-dataset Caltech101 \
    --eval-datasets Caltech101 \
    --data-location D:/dataset/ \
    --batch-size 64 \
    --batch-size-eval 64 \
    --num-workers 4 \
    --lr 2e-3 \
    --iterations 1000 \
    --prompt_width 2 \
    --prompt_depth_vision 12 \
    --prompt_depth_text 12 \
    --task-query-mode hybrid \
    --task-query-batches 2 \
    --task-query-topk 3 \
    --task-query-conf-weight 0.2 \
    --save ckpt/exp_CPrompt_2_12 \
    --load ckpt/exp_CPrompt_2_12/Aircraft.pth
```

Few-shot training example:
```shell
python -m src.main \
    --train-mode prompt \
    --trainer CPrompt \
    --train-dataset Aircraft \
    --eval-datasets Aircraft \
    --few-shot 5 \
    --data-location D:/dataset/ \
    --batch-size 16 \
    --batch-size-eval 16 \
    --num-workers 4 \
    --lr 2e-3 \
    --iterations 500 \
    --eval-interval 500 \
    --prompt_width 2 \
    --prompt_depth_vision 12 \
    --prompt_depth_text 12 \
    --task-query-mode hybrid \
    --save ckpt/exp_CPrompt_fewshot_2_12
```

Evaluate a saved continual-learning run:
```shell
python -m src.general_eval \
    --eval_names exp_CPrompt_2_12 \
    --trainer CPrompt \
    --data-location D:/dataset/ \
    --batch-size 64 \
    --batch-size-eval 64 \
    --num-workers 4 \
    --prompt_width 2 \
    --prompt_depth_vision 12 \
    --prompt_depth_text 12 \
    --task-query-mode hybrid \
    --task-query-batches 2 \
    --task-query-topk 3 \
    --task-query-conf-weight 0.2
```

Results are saved to `results/<eval_names>.npy`.
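Per-task results like these are commonly arranged as a task-by-task accuracy matrix; assuming that layout (check your saved file, the actual array structure may differ), standard continual-learning summary metrics can be computed with a short NumPy sketch:

```python
import numpy as np

def continual_metrics(acc):
    """Compute average accuracy and forgetting from a task x task matrix,
    where acc[i, j] is accuracy on task j after training on tasks 0..i.
    This layout is an assumption, not guaranteed by the saved .npy file."""
    n = acc.shape[0]
    avg_acc = acc[-1].mean()  # mean accuracy over all tasks after the last one
    # Forgetting: best accuracy ever reached on an old task minus its final accuracy.
    forgetting = np.mean([acc[:-1, j].max() - acc[-1, j] for j in range(n - 1)])
    return avg_acc, forgetting

# Toy 3-task example.
acc = np.array([[0.9, 0.0, 0.0],
                [0.8, 0.7, 0.0],
                [0.7, 0.6, 0.8]])
avg, fgt = continual_metrics(acc)
print(round(avg, 3), round(fgt, 3))  # → 0.7 0.15
```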
Inference-time prompt retrieval now supports multiple modes:
- `hybrid`: recommended default. Uses a small set of unlabeled test images to build a proxy text query, retrieves the top candidates from the text prototype pool, and reranks them with prompt confidence.
- `proxy_text`: image-conditioned proxy-text retrieval only.
- `legacy_text`: the previous text-to-text lookup behavior. Useful only for compatibility or ablation.
For current experiments, use `--task-query-mode hybrid`.
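To illustrate the two stages of the hybrid mode (prototype retrieval, then confidence reranking), here is a minimal NumPy sketch. It is not the repository's implementation: the prototype layout and the per-task prompt-confidence scores are assumptions.

```python
import numpy as np

def hybrid_task_retrieval(proxy_query, text_prototypes, prompt_confidence,
                          topk=3, conf_weight=0.2):
    """Sketch of hybrid retrieval: cosine similarity against the text
    prototype pool selects top-k candidate tasks, which are then reranked
    by blending in a confidence score from each task's prompts."""
    # Normalize for cosine similarity.
    q = proxy_query / np.linalg.norm(proxy_query)
    protos = text_prototypes / np.linalg.norm(text_prototypes, axis=1, keepdims=True)
    sims = protos @ q                          # one similarity per task
    candidates = np.argsort(sims)[::-1][:topk]  # top-k tasks by similarity
    # Rerank candidates: similarity plus weighted prompt confidence.
    scores = sims[candidates] + conf_weight * prompt_confidence[candidates]
    return candidates[np.argmax(scores)]

# Toy example: 4 tasks with one-hot 8-dim prototypes; the query is
# closest to task 2, which also has the highest prompt confidence.
protos = np.eye(4, 8)
query = np.array([0.1, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0])
conf = np.array([0.1, 0.2, 0.9, 0.3])
print(hybrid_task_retrieval(query, protos, conf))  # → 2
```

The `conf_weight` term plays the role of `--task-query-conf-weight`, and `topk` the role of `--task-query-topk`, under this sketch's assumptions.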
The main scripts are:
- `scripts/CPrompt.sh`
- `scripts/CPrompt-fewshot.sh`
- `scripts/CPrompt-ablationdepth.sh`
- `scripts/CPrompt-ablationlength.sh`
Example:
```shell
bash scripts/CPrompt.sh
```

- New checkpoints save `prototype_valid` together with the prompt/prototype pools.
- Older checkpoints can still be loaded, but for the cleanest retrieval behavior it is better to re-save or re-train with the current code.
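A small sketch of how an older checkpoint could be upgraded in place; the key names other than `prototype_valid` are hypothetical and the real checkpoint structure may differ:

```python
def upgrade_checkpoint(ckpt):
    """Backfill prototype_valid flags for checkpoints saved before the
    current code. Assumes ckpt is a plain dict whose prototype pool sits
    under a 'prototype_pool' key (key name is hypothetical)."""
    if "prototype_valid" not in ckpt:
        # Older runs saved no flags: mark every stored prototype as valid.
        ckpt["prototype_valid"] = [True] * len(ckpt.get("prototype_pool", []))
    return ckpt

old = {"prompt_pool": ["p0", "p1"], "prototype_pool": ["t0", "t1"]}
print(upgrade_checkpoint(old)["prototype_valid"])  # → [True, True]
```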
If you find this repository useful, please cite:
```bibtex
@inproceedings{wang2025chordprompt,
  title={ChordPrompt: Orchestrating Cross-Modal Prompt Synergy for Multi-domain Incremental Learning in CLIP},
  author={Wang, Zhiyuan and Chen, Bokui},
  booktitle={Joint European Conference on Machine Learning and Knowledge Discovery in Databases},
  pages={147--164},
  year={2025},
  organization={Springer}
}
```

This project builds on wise-ft and Continual-CLIP. Thanks to the original authors for releasing their code.