Active learning is designed to minimize annotation effort by prioritizing instances that most enhance learning. However, many active learning strategies struggle with a 'cold start' problem, needing substantial initial data to be effective. This limitation often reduces their utility for pre-trained models, which already perform well in few-shot scenarios. To address this, we introduce ActiveLLM, a novel active learning approach that leverages large language models such as GPT-4, Llama 3, and Mistral Large for selecting instances.
We demonstrate that ActiveLLM significantly enhances the classification performance of BERT classifiers in few-shot scenarios, outperforming both traditional active learning methods and the few-shot learning method SetFit. Additionally, ActiveLLM can be extended to non-few-shot scenarios, allowing for iterative selections. In this way, ActiveLLM can even help other active learning strategies to overcome their cold start problem. Our results suggest that ActiveLLM offers a promising solution for improving model performance across various learning setups.
If you have any questions, need access to datasets or the complete research data, or if you encounter any bugs, please feel free to contact me!
This repository contains the experimental code and results (including prompts and answers) used for the ActiveLLM experiments as described in the paper by Markus Bayer and Christian Reuter.
Notice: Since the paper is currently only a preprint, the code is not fully optimized: some manual steps have to be performed, and the scripts may contain bugs.
- `/prompts/` - Directory where generated prompts are saved.
- `/logs/run/` - Directory where experiment results are stored.
- `prompt_generation.py` - A script for generating the prompt for the LLM.
- `model_training.py` - A script for training a BERT model based on the given answers of the LLM.
- `automatic_run.py` - A script to run `model_training.py` multiple times with different run parameters.
This script is used to generate prompts that will be fed into the large language models (LLMs) for instance selection.
- The script includes the following parameters:
- MODE = "10"
- RUN = "1"
- TASK = "cti"
- EXAMPLES = 25
- CONTINOUS = "IDXRECAP" # Options: False, NORECAP, RECAP, IDXRECAP
- Modes:
- Mode 1: CoT - it reiterates the advice
- Mode 2: No CoT
- Mode 3: No CoT but tasked to explain each instance
- Mode 4: No advice, but CoT
- Mode 5: Mode 4 and Mode 2
- Mode 6: Mode 4 and Mode 3
- Mode 7: Best one (Mode 4) with Guidelines
- Mode 8: Mode 4 with 50 instances
- Mode 9: Mode 4 with 100 instances
- Mode 10: Mode 4 with 200 instances
- Mode 11: Mode 4 with 400 instances
Usage:
python prompt_generation.py
Results:
The results have to be retrieved from the chat models and saved in `answers/<dataset_name>/<chat_model_name>/answer_<#run>.txt`, and the manually extracted label list in `answers/<dataset_name>/<chat_model_name>/list_<#run>.txt` - see the existing files for reference.
This script trains a BERT model based on the instances selected by the LLM. The model is trained in a few-shot learning setup to evaluate the performance enhancement brought by ActiveLLM.
Usage:
python model_training.py --task <task_name> --mode <mode_name> --run <run_number> --modelclass <model_class> --activelearning <active_learning_strategy> --warmstart <warm_start_option>
This script automates the execution of model_training.py with various parameters to facilitate extensive experimentation.
Content:
```python
# run model_training.py multiple times with different run parameters
import subprocess

TASK = ["sst2"]
MODES = ["10"]
RUNS = ["1"]
MODELCLASS = "default"
ACTIVELEARNING = ["None"]
WARMSTART = "False"

for task in TASK:
    for mode in MODES:
        for activelearning in ACTIVELEARNING:
            for run in RUNS:
                subprocess.call([
                    "python", "model_training.py",
                    "--task", task,
                    "--mode", mode,
                    "--run", run,
                    "--modelclass", MODELCLASS,
                    "--activelearning", activelearning,
                    "--warmstart", WARMSTART,
                ])
```
If you use this code in your research, please cite our paper:
@misc{bayer2024activellmlargelanguagemodelbased,
title={ActiveLLM: Large Language Model-based Active Learning for Textual Few-Shot Scenarios},
author={Markus Bayer and Christian Reuter},
year={2024},
eprint={2405.10808},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2405.10808},
}
For any questions or issues, please contact Markus Bayer.