C2-ML-Detection-Framework

A modular, ML-powered framework for detecting command-and-control (C2) malware from network traffic data. It was developed as part of a research project on detecting malware with sequence-aware deep learning models (e.g., LSTM).

The framework prioritizes experimentation, extensibility, and reproducibility, with MLflow experiment tracking and Python environment management via Poetry.

Directory Structure

malware-model-training/
│
├── data/
│   ├── raw/
│   │   ├── csv/
│   │   │   └── [raw data files for malware in csv]
│   │   └── pcap/
│   │       └── [raw data files for malware in pcap]
│   ├── processed/
│   │   ├── malware_1.csv
│   │   └── malware_2.csv
│   └── labelled/
│       ├── malware_1.csv
│       └── malware_2.csv
│
├── models/
│   ├── malware_1/
│   │   └── [trained models for malware_1]
│   └── malware_2/
│       └── [trained models for malware_2]
│
├── notebooks/
│   ├── data_processing/
│   │   ├── malware_1.ipynb
│   │   └── malware_2.ipynb
│   ├── modeling/
│   │   ├── malware_1.ipynb
│   │   └── malware_2.ipynb
│   ├── data_labelling/
│   │   ├── malware_1.ipynb
│   │   └── malware_2.ipynb
│   └── data_parsing/
│       ├── malware_1.py
│       └── malware_2.py
│
├── variables/
│   ├── malware_1/
│   │   └── scaler.pkl
│   └── malware_2/
│       └── scaler.pkl
│
└── [other project files, e.g., README.md, requirements.txt, etc.]
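
The tree ties each malware family to a fixed set of locations. A minimal sketch of a path helper follows, assuming the family name (for example dridex) is used both as the CSV file stem and as the subdirectory name, as the tree suggests. FamilyPaths, paths_for, and missing_paths are hypothetical names and are not part of the repository.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class FamilyPaths:
    """Per-family locations implied by the directory tree (hypothetical helper)."""
    processed_csv: Path
    labelled_csv: Path
    model_dir: Path
    scaler: Path


def paths_for(family: str, root: Path = Path(".")) -> FamilyPaths:
    """Build the expected paths for one malware family, e.g. "dridex"."""
    return FamilyPaths(
        processed_csv=root / "data" / "processed" / f"{family}.csv",
        labelled_csv=root / "data" / "labelled" / f"{family}.csv",
        model_dir=root / "models" / family,
        scaler=root / "variables" / family / "scaler.pkl",
    )


def missing_paths(family: str, root: Path = Path(".")) -> list[Path]:
    """Return the expected paths that do not exist yet under root."""
    paths = paths_for(family, root)
    candidates = [paths.processed_csv, paths.labelled_csv, paths.model_dir, paths.scaler]
    return [p for p in candidates if not p.exists()]
```

Calling missing_paths("malware_1") from the repository root is a quick way to see which pipeline stages still need to be run for that family.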

Features

End-to-end ML pipeline for malware traffic detection
Reproducible experiments with MLflow
Dependency isolation using Poetry
Real-world datasets including Dridex and Emotet
Deep learning models (LSTM-based) for temporal pattern recognition

Setup

Clone the Repository

git clone https://github.com/Yousinator/C2-ML-Detection-Framework.git
cd C2-ML-Detection-Framework

Install Poetry (if you haven't already)

curl -sSL https://install.python-poetry.org | python3 -

Install Dependencies
```
poetry install
```
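Activate the Virtual Environment
```
poetry shell
```
On Poetry 2.x the shell command is provided by the poetry-plugin-shell plugin. Without that plugin, prefix commands with poetry run, or run the activation command printed by poetry env activate.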

Notebooks

All experimentation is done through notebooks inside the notebooks/ directory. Each notebook is self-contained and includes:

Data parsing
Data loading and preprocessing
Feature engineering
Model training and evaluation

MLflow artifacts and metrics are logged automatically to the mlruns/ folder.
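
For orientation, here is a minimal sketch of what logging one training run to a local mlruns/ folder can look like with the MLflow tracking API. The experiment name, parameter names, and metric name are placeholders, and the notebooks may instead rely on mlflow.autolog(). Treat this as an assumption about usage, not as the repository's code.

```python
from pathlib import Path


def local_tracking_uri(root: Path = Path(".")) -> str:
    """Return a file: URI pointing at the mlruns/ folder under root."""
    return (root / "mlruns").resolve().as_uri()


def log_training_run(params: dict, epoch_losses: list[float], root: Path = Path(".")) -> None:
    """Log hyperparameters and per-epoch validation loss for one run (requires mlflow)."""
    import mlflow  # imported lazily so the helper above works without MLflow installed

    mlflow.set_tracking_uri(local_tracking_uri(root))
    mlflow.set_experiment("dridex-lstm")  # hypothetical experiment name
    with mlflow.start_run():
        mlflow.log_params(params)
        for epoch, loss in enumerate(epoch_losses):
            # One point per epoch so the loss curve shows up in the MLflow UI.
            mlflow.log_metric("val_loss", loss, step=epoch)
```

Running mlflow ui from the repository root then serves the logged runs for inspection in a browser.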

Datasets

The framework supports labeled datasets for C2 malware such as:

Dridex C2 traffic
Emotet C2 traffic

Data lives under the data/ directory. Its structure and the preprocessing steps are documented in the Jupyter notebooks under notebooks/.
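
Before training, it is worth checking the class balance of a labelled file. The sketch below uses only the standard library. The column name label is an assumption, since this README does not document the CSV schema, and label_counts is a hypothetical helper.

```python
import csv
from collections import Counter
from pathlib import Path


def label_counts(csv_path: Path, label_column: str = "label") -> Counter:
    """Count how many rows carry each label value in a labelled CSV."""
    with csv_path.open(newline="") as handle:
        reader = csv.DictReader(handle)
        if label_column not in (reader.fieldnames or []):
            raise KeyError(f"{label_column!r} column not found in {csv_path}")
        # Values are read as strings, so the counter keys are strings too.
        return Counter(row[label_column] for row in reader)
```

For example, label_counts(Path("data/labelled/malware_1.csv")) returns a Counter keyed by the label values found in that file.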

Model Overview

Core model: LSTM-based malware traffic classifier
Input features: Sequence of flow-level and packet-level statistics
Output: Binary label (malicious / benign)
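
To make the input and output shapes concrete, the sketch below slices a table of per-flow feature rows into fixed-length sequences and defines a small LSTM binary classifier. Keras, the window length, the layer size, and the rule that a window counts as malicious if any row in it is malicious are all assumptions for illustration. The README does not specify them, and make_windows and build_lstm_classifier are hypothetical names.

```python
def make_windows(rows, labels, timesteps):
    """Slice consecutive feature rows into overlapping windows of length timesteps.

    A window is labelled 1 if any row inside it is malicious (an assumed rule).
    """
    if timesteps <= 0:
        raise ValueError("timesteps must be positive")
    windows, window_labels = [], []
    for start in range(len(rows) - timesteps + 1):
        end = start + timesteps
        windows.append(rows[start:end])
        window_labels.append(int(any(labels[start:end])))
    return windows, window_labels


def build_lstm_classifier(timesteps, n_features, units=64):
    """Return a compiled Keras model mapping (timesteps, n_features) to P(malicious)."""
    from tensorflow import keras  # lazy import; the framework choice is an assumption

    model = keras.Sequential([
        keras.Input(shape=(timesteps, n_features)),
        keras.layers.LSTM(units),
        keras.layers.Dense(1, activation="sigmoid"),  # single probability output
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    return model
```

Converting the windows to a NumPy array gives shape (n_windows, timesteps, n_features), which is the layout Keras LSTM layers expect.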

Citation

If you use this framework in your research or project, please consider citing:

@misc{musabeh2025c2ml,
  author       = {Yousef Musabeh},
  title        = {A Machine Learning Framework for Detecting Command-and-Control Malware via Network Behavior},
  year         = {2025},
  url          = {https://github.com/Yousinator/C2-ML-Detection-Framework}
}

License

This project is licensed under the MIT License. See the LICENSE file for more details.
