Official implementation of ChordPrompt: Orchestrating Cross-Modal Prompt Synergy for Multi-domain Incremental Learning in CLIP, accepted at ECML-PKDD 2025.
This repository focuses on multi-domain task-incremental learning with CLIP-based prompt tuning.
The current codebase is organized around a few core modules:
- `src/main.py`: entry point for training.
- `src/models/prompt_tune.py`: prompt-tuning training loop.
- `src/models/prompt_factory.py`: unified prompt-model construction.
- `src/models/evaluation.py`: single-checkpoint evaluation during training.
- `src/general_eval.py`: batch evaluation across saved checkpoints.
- `src/task_retrieval.py`: task retrieval logic used at inference time.
- `custom_clip/PromptCross.py`: ChordPrompt model definition and prompt/prototype memory.
- `scripts/*.sh`: reproduction and ablation scripts.
Install dependencies with:
```shell
pip install -r requirements.txt
```

Required datasets:

- ImageNet
- Conceptual_Captions
Target datasets used in the paper:
- Aircraft
- Caltech101
- CIFAR10
- CIFAR100
- DTD
- EuroSAT
- Flowers
- Food
- MNIST
- OxfordPet
- StanfordCars
- SUN397
By default, the code reads data from `D:/dataset/`. Override this with `--data-location`.
Train a single task from scratch:
```shell
python -m src.main \
    --train-mode prompt \
    --trainer CPrompt \
    --train-dataset Aircraft \
    --eval-datasets Aircraft \
    --data-location D:/dataset/ \
    --batch-size 64 \
    --batch-size-eval 64 \
    --num-workers 4 \
    --lr 2e-3 \
    --iterations 1000 \
    --prompt_width 2 \
    --prompt_depth_vision 12 \
    --prompt_depth_text 12 \
    --task-query-mode hybrid \
    --task-query-batches 2 \
    --task-query-topk 3 \
    --task-query-conf-weight 0.2 \
    --save ckpt/exp_CPrompt_2_12
```

Continue training on the next task:
```shell
python -m src.main \
    --train-mode prompt \
    --trainer CPrompt \
    --train-dataset Caltech101 \
    --eval-datasets Caltech101 \
    --data-location D:/dataset/ \
    --batch-size 64 \
    --batch-size-eval 64 \
    --num-workers 4 \
    --lr 2e-3 \
    --iterations 1000 \
    --prompt_width 2 \
    --prompt_depth_vision 12 \
    --prompt_depth_text 12 \
    --task-query-mode hybrid \
    --task-query-batches 2 \
    --task-query-topk 3 \
    --task-query-conf-weight 0.2 \
    --save ckpt/exp_CPrompt_2_12 \
    --load ckpt/exp_CPrompt_2_12/Aircraft.pth
```

Few-shot training example:
```shell
python -m src.main \
    --train-mode prompt \
    --trainer CPrompt \
    --train-dataset Aircraft \
    --eval-datasets Aircraft \
    --few-shot 5 \
    --data-location D:/dataset/ \
    --batch-size 16 \
    --batch-size-eval 16 \
    --num-workers 4 \
    --lr 2e-3 \
    --iterations 500 \
    --eval-interval 500 \
    --prompt_width 2 \
    --prompt_depth_vision 12 \
    --prompt_depth_text 12 \
    --task-query-mode hybrid \
    --save ckpt/exp_CPrompt_fewshot_2_12
```

Evaluate a saved continual-learning run:
```shell
python -m src.general_eval \
    --eval_names exp_CPrompt_2_12 \
    --trainer CPrompt \
    --data-location D:/dataset/ \
    --batch-size 64 \
    --batch-size-eval 64 \
    --num-workers 4 \
    --prompt_width 2 \
    --prompt_depth_vision 12 \
    --prompt_depth_text 12 \
    --task-query-mode hybrid \
    --task-query-batches 2 \
    --task-query-topk 3 \
    --task-query-conf-weight 0.2
```

Results are saved to `results/<eval_names>.npy`.
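Per-task results like these are commonly arranged as a task-by-task accuracy matrix; assuming that layout (check your saved file, the actual array structure may differ), standard continual-learning summary metrics can be computed with a short NumPy sketch:

```python
import numpy as np

def continual_metrics(acc):
    """Compute average accuracy and forgetting from a task x task matrix,
    where acc[i, j] is accuracy on task j after training on tasks 0..i.
    This layout is an assumption, not guaranteed by the saved .npy file."""
    n = acc.shape[0]
    avg_acc = acc[-1].mean()  # mean accuracy over all tasks after the last one
    # Forgetting: best accuracy ever reached on an old task minus its final accuracy.
    forgetting = np.mean([acc[:-1, j].max() - acc[-1, j] for j in range(n - 1)])
    return avg_acc, forgetting

# Toy 3-task example.
acc = np.array([[0.9, 0.0, 0.0],
                [0.8, 0.7, 0.0],
                [0.7, 0.6, 0.8]])
avg, fgt = continual_metrics(acc)
print(round(avg, 3), round(fgt, 3))  # → 0.7 0.15
```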
Inference-time prompt retrieval now supports multiple modes:
- `hybrid`: recommended default. Uses a small set of unlabeled test images to build a proxy text query, retrieves the top candidates from the text prototype pool, and reranks them with prompt confidence.
- `proxy_text`: image-conditioned proxy-text retrieval only.
- `legacy_text`: the previous text-to-text lookup behavior. Useful only for compatibility or ablation.
For current experiments, use `--task-query-mode hybrid`.
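To illustrate the two stages of the hybrid mode (prototype retrieval, then confidence reranking), here is a minimal NumPy sketch. It is not the repository's implementation: the prototype layout and the per-task prompt-confidence scores are assumptions.

```python
import numpy as np

def hybrid_task_retrieval(proxy_query, text_prototypes, prompt_confidence,
                          topk=3, conf_weight=0.2):
    """Sketch of hybrid retrieval: cosine similarity against the text
    prototype pool selects top-k candidate tasks, which are then reranked
    by blending in a confidence score from each task's prompts."""
    # Normalize for cosine similarity.
    q = proxy_query / np.linalg.norm(proxy_query)
    protos = text_prototypes / np.linalg.norm(text_prototypes, axis=1, keepdims=True)
    sims = protos @ q                          # one similarity per task
    candidates = np.argsort(sims)[::-1][:topk]  # top-k tasks by similarity
    # Rerank candidates: similarity plus weighted prompt confidence.
    scores = sims[candidates] + conf_weight * prompt_confidence[candidates]
    return candidates[np.argmax(scores)]

# Toy example: 4 tasks with one-hot 8-dim prototypes; the query is
# closest to task 2, which also has the highest prompt confidence.
protos = np.eye(4, 8)
query = np.array([0.1, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0])
conf = np.array([0.1, 0.2, 0.9, 0.3])
print(hybrid_task_retrieval(query, protos, conf))  # → 2
```

The `conf_weight` term plays the role of `--task-query-conf-weight`, and `topk` the role of `--task-query-topk`, under this sketch's assumptions.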
The main scripts are:
- `scripts/CPrompt.sh`
- `scripts/CPrompt-fewshot.sh`
- `scripts/CPrompt-ablationdepth.sh`
- `scripts/CPrompt-ablationlength.sh`
Example:
```shell
bash scripts/CPrompt.sh
```

- New checkpoints save `prototype_valid` together with the prompt/prototype pools.
- Older checkpoints can still be loaded, but for the cleanest retrieval behavior it is better to re-save or re-train with the current code.
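A small sketch of how an older checkpoint could be upgraded in place; the key names other than `prototype_valid` are hypothetical and the real checkpoint structure may differ:

```python
def upgrade_checkpoint(ckpt):
    """Backfill prototype_valid flags for checkpoints saved before the
    current code. Assumes ckpt is a plain dict whose prototype pool sits
    under a 'prototype_pool' key (key name is hypothetical)."""
    if "prototype_valid" not in ckpt:
        # Older runs saved no flags: mark every stored prototype as valid.
        ckpt["prototype_valid"] = [True] * len(ckpt.get("prototype_pool", []))
    return ckpt

old = {"prompt_pool": ["p0", "p1"], "prototype_pool": ["t0", "t1"]}
print(upgrade_checkpoint(old)["prototype_valid"])  # → [True, True]
```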
If you find this repository useful, please cite:
```bibtex
@inproceedings{wang2025chordprompt,
  title={ChordPrompt: Orchestrating Cross-Modal Prompt Synergy for Multi-domain Incremental Learning in CLIP},
  author={Wang, Zhiyuan and Chen, Bokui},
  booktitle={Joint European Conference on Machine Learning and Knowledge Discovery in Databases},
  pages={147--164},
  year={2025},
  organization={Springer}
}
```

This project builds on wise-ft and Continual-CLIP. Thanks to the original authors for releasing their code.