padas-lab-de/agent-action-controller

Agent Action Controller

Layout

agent-action-controller/
  classification/
    01_csv_dataset_creation.py
    02_csv_dataset_EDA.py
    03_cls_zero_shot.py
    04_cls_train_lora_sft.py
    04b_cls_train_lora_sft_binary.py
    05_cls_train_lora_seqcls.py
    06_binary_ensemble_eval.py
    06_cls_train_probe.py
    07_cls_ml_baseline.py
    08_cls_compare.py
    08_cls_inference_benchmark.py
    09_cls_encoder.py
    cls_utils.py
    generative_utils.py

  e2e_search/
    agent_env.py
    build_corpus.py
    run_agent.py
    run_pipeline.py
    evaluate.py
    compare.py
    compare_controller.py
    compare_fixed_generator.py
    analyze_repeated_e2e.py
    e2e_utils.py
    retriever.py
    openrouter_client.py

Workflow

The classification/ directory contains the trajectory-action classification workflow:

  1. 01_csv_dataset_creation.py: build the CSV dataset from trace files.
  2. 02_csv_dataset_EDA.py: compute dataset statistics and exploratory analysis.
  3. 03_cls_zero_shot.py: run zero-shot generative classification.
  4. 04_cls_train_lora_sft.py: train multiclass LoRA SFT models.
  5. 04b_cls_train_lora_sft_binary.py: train binary LoRA SFT models.
  6. 05_cls_train_lora_seqcls.py: train LoRA sequence-classification models.
  7. 06_binary_ensemble_eval.py: evaluate the binary ensemble setup.
  8. 06_cls_train_probe.py: train probing classifiers.
  9. 07_cls_ml_baseline.py: run classical ML baselines.
  10. 08_cls_compare.py: compare classification runs.
  11. 08_cls_inference_benchmark.py: benchmark inference.
  12. 09_cls_encoder.py: train/evaluate encoder-based classifiers.

The files cls_utils.py and generative_utils.py are local helper modules required by the classification scripts.
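The numeric prefixes mean a plain lexicographic sort of the file names reproduces the order listed above, including `04_` before `04b_` and the two `06_` and `08_` pairs. The following is a minimal sketch, not part of the repository, that lists the numbered scripts in that order and skips the unprefixed helper modules. The function name `numbered_scripts` and the prefix pattern are assumptions made for illustration. The README does not document any command-line arguments, so the sketch only lists the scripts rather than running them.

```python
from pathlib import Path
import re

# Matches a numeric stage prefix such as "01_", "04b_", or "08_".
# The pattern is an assumption inferred from the file names in the layout.
STAGE_PREFIX = re.compile(r"^\d+[a-z]?_")


def numbered_scripts(directory):
    """Return the prefixed *.py file names in `directory`, sorted by name."""
    return sorted(
        p.name for p in Path(directory).glob("*.py") if STAGE_PREFIX.match(p.name)
    )


if __name__ == "__main__":
    # Assumes the current working directory is the repository root.
    for name in numbered_scripts("classification"):
        print(name)
```

Helper modules such as cls_utils.py and generative_utils.py carry no numeric prefix, so they are filtered out of the listing.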

The e2e_search/ directory contains the end-to-end search-control evaluation code:

  1. build_corpus.py: construct the retrieval corpus.
  2. run_agent.py: run the local vLLM end-to-end agent.
  3. run_pipeline.py: run the controller/generator pipeline variants.
  4. evaluate.py: compute answer-quality and trajectory metrics.
  5. compare.py: compare base and fine-tuned end-to-end runs.
  6. compare_controller.py: compare controller-ablation runs.
  7. compare_fixed_generator.py: compare fixed-generator controller runs.
  8. analyze_repeated_e2e.py: aggregate repeated runs and compute uncertainty estimates.

The files agent_env.py, e2e_utils.py, retriever.py, and openrouter_client.py are local support modules used by the E2E scripts.
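The README does not say which uncertainty estimates analyze_repeated_e2e.py computes. As a hedged illustration of the general idea of aggregating repeated runs, the sketch below reports the mean of a per-run metric together with its standard error. The function name `summarize_runs` is hypothetical, the choice of standard error is an assumption, and the scores are placeholder values rather than results from this repository.

```python
import math
import statistics


def summarize_runs(scores):
    """Return (mean, standard error of the mean) for a list of per-run scores."""
    if len(scores) < 2:
        raise ValueError("need at least two runs to estimate spread")
    mean = statistics.fmean(scores)
    # Sample standard deviation divided by sqrt(n) gives the standard error.
    sem = statistics.stdev(scores) / math.sqrt(len(scores))
    return mean, sem


if __name__ == "__main__":
    # Hypothetical placeholder scores, not measurements from the repository.
    mean, sem = summarize_runs([0.5, 0.6, 0.7])
    print(f"mean={mean:.3f} sem={sem:.3f}")
```

The actual script may use a different estimator, such as bootstrap confidence intervals; check its source before relying on this sketch.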
