Before running the analysis pipeline, you must download the two datasets: cardoen-2024.csv and koishybayev-2022.csv. They can be found in this Figshare repository, inside the replication.zip > data > datasets directory. Afterwards, place the downloaded datasets directory in the data directory of this repository. The file tree should look like this:
under-pressure
├── data
│ ├── datasets
│ │ ├── cardoen-2024.csv
│ │ └── koishybayev-2022.csv
│ └── vulnerable-actions.json
└── ...
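The layout above can be checked with a short script before running the pipeline. The following is a minimal sketch and not part of the replication package: the helper name missing_files is hypothetical, and the script assumes it is run from the root of this repository.

```python
from pathlib import Path

# Relative paths taken from the file tree above.
EXPECTED_FILES = (
    Path("data") / "datasets" / "cardoen-2024.csv",
    Path("data") / "datasets" / "koishybayev-2022.csv",
    Path("data") / "vulnerable-actions.json",
)


def missing_files(repo_root: Path) -> list[Path]:
    """Return the expected files that do not exist under repo_root (hypothetical helper)."""
    return [rel for rel in EXPECTED_FILES if not (repo_root / rel).is_file()]


if __name__ == "__main__":
    # Assumption: the current working directory is the repository root.
    missing = missing_files(Path.cwd())
    if missing:
        for rel in missing:
            print(f"missing: {rel}")
    else:
        print("All expected data files are in place.")
```

If any path is reported as missing, re-check that the datasets directory was extracted into data and not into a nested folder.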
Another requirement is the Soteria executable, the tool that detects and extracts the misconfigurations from the YAML workflows. The executable is available here; choose the one that matches your platform and architecture and download it. Once downloaded, rename it to soteria and place it in the root of this repository.
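A quick way to confirm that the renamed executable is where the pipeline expects it is sketched here. This is an illustrative check under stated assumptions, not part of the replication package: the function name soteria_ready is hypothetical, the script assumes a Unix-like system (Linux or macOS), and it assumes it is run from the repository root.

```python
import os
from pathlib import Path


def soteria_ready(repo_root: Path) -> bool:
    """True if a file named 'soteria' exists in repo_root and is executable (hypothetical helper)."""
    candidate = repo_root / "soteria"
    return candidate.is_file() and os.access(candidate, os.X_OK)


if __name__ == "__main__":
    # Assumption: the current working directory is the repository root.
    if soteria_ready(Path.cwd()):
        print("soteria found and executable.")
    else:
        print("soteria is missing or not executable; 'chmod +x soteria' may be needed.")
```

Downloaded binaries often lose their executable bit, which is why the sketch checks the permission as well as the file's presence.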
To run the analysis scripts, you first need to create and activate the conda environment. The environment is defined in the provided environment.yml file and can be set up by running the following commands:
conda env create -f environment.yml
conda activate under-pressure
Finally, we are ready to run the scripts. To perform the analysis on the data present in the data/repo directory, run the following command:
python -m src.main
To change the dataset passed to the analysis pipeline, change the name of the dataset inside src > main.py. Simply uncomment the line of the dataset you want to analyze and comment out the other one.
if __name__ == '__main__':
    dataset: str = 'cardoen-2024'
    # dataset: str = 'koishybayev-2022'
After running the pipeline, you will be able to use the .ipynb files placed in the notebooks directory.
If a popup appears (on macOS) saying something along the lines of "'soteria' can't be opened because the identity of the developer cannot be confirmed," head to System Settings > Privacy & Security, scroll down, and click Open Anyway. When you run the script again, you will be prompted once more to run the executable; this time the "Open" option will be available. Click it and everything should work.
This repository serves as the replication package for the following publication:
Riggio, E. and Pautasso, C. (2025). Pipelines Under Pressure: An Empirical Study of Security Misconfigurations of GitHub Workflows. Proceedings of the 26th International Conference on Product-Focused Software Process Improvement (PROFES), pp. 220-236, Springer, doi: 10.1007/978-3-032-12089-2_14
- Edoardo Riggio - https://edoriggio.com