This repository is part of the CS266 - Topics in Information Security course project titled "Enhanced Anomaly Detection in Keystroke Dynamics Authentication". The project was submitted as part of the course requirements by the team members:
- Rashmi Sonth
- Sakshi Sanskruti Tripathy
The project focuses on analyzing different methodologies for anomaly detection in keystroke dynamics authentication, leveraging advanced machine learning and deep learning models.
- `demographics.csv`:
  - Contains demographic information of participants.
  - Used to supplement analysis with demographic-based features.
- `free-text.csv`:
  - Contains raw keystroke dynamics data.
  - Includes timing features such as `DU.key1.key1`, `DD.key1.key2`, etc.
  - This dataset is critical for building and testing machine learning models.
- Dataset Source:
  - The dataset used in this project can be downloaded from Zenodo.
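As a rough illustration of the timing features mentioned above, the sketch below picks out timing columns by their name prefix and averages them per participant. It assumes timing columns start with a two-letter prefix and a dot, as in `DU.key1.key1` and `DD.key1.key2`. The `participant` column, the sample values, and the `timing_columns` helper are hypothetical and are not taken from the dataset.

```python
import pandas as pd


def timing_columns(df, prefixes=("DU.", "DD.")):
    """Return the column names that start with one of the given prefixes."""
    return [col for col in df.columns if col.startswith(tuple(prefixes))]


# Hypothetical rows for illustration only (not real measurements).
# The "participant" column name is an assumption.
sample = pd.DataFrame(
    {
        "participant": ["p001", "p001", "p002"],
        "DU.key1.key1": [0.10, 0.12, 0.09],
        "DD.key1.key2": [0.25, 0.31, 0.22],
    }
)

cols = timing_columns(sample)
# Mean of each timing feature per participant.
summary = sample.groupby("participant")[cols].mean()
print(summary)
```

Selecting by prefix keeps the sketch independent of the exact set of timing columns, which should be checked against the header of `free-text.csv`.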
- `models.ipynb`:
  - The main Jupyter Notebook for training and evaluating models.
  - Implements machine learning and deep learning approaches (e.g., LSTM, CNN).
  - Important: Ensure the dataset paths (`free-text.csv` and `demographics.csv`) are correctly updated in this notebook based on your local environment.
- `output.png`:
  - Helps to evaluate the effectiveness of implemented techniques.
- Update Dataset Paths:
  - The datasets are located in the `dataset` folder.
  - Ensure you update the dataset paths in the `models.ipynb` file wherever necessary:
    data = pd.read_csv('dataset/free-text.csv')
    demographics = pd.read_csv('dataset/demographics.csv')
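One way to make a wrong path fail early with a clear message is to check both files before reading them. This is a minimal sketch, not the notebook's actual code: the `load_datasets` helper is hypothetical, and only the two file names and the `dataset` folder come from this README.

```python
from pathlib import Path

import pandas as pd


def load_datasets(dataset_dir="dataset"):
    """Load both CSV files from dataset_dir, failing early if one is missing."""
    base = Path(dataset_dir)
    paths = {
        "data": base / "free-text.csv",
        "demographics": base / "demographics.csv",
    }
    # Collect every missing file so the error names all of them at once.
    missing = [str(p) for p in paths.values() if not p.is_file()]
    if missing:
        raise FileNotFoundError(f"Dataset file(s) not found: {', '.join(missing)}")
    return pd.read_csv(paths["data"]), pd.read_csv(paths["demographics"])
```

It could then be called as `data, demographics = load_datasets("dataset")`, with the argument changed to match your local folder.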
- Run the Notebook:
  - Open `models.ipynb` in Jupyter Notebook or any compatible environment.
  - Execute the cells step-by-step to train and evaluate the models.
- Python 3.7+
- Libraries: `tensorflow`, `numpy`, `pandas`, `matplotlib`, `seaborn`, `sklearn` (installed as `scikit-learn`)
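To see which of the listed libraries are missing before running anything, a short check like the following may help. It is only a sketch: the `missing_modules` helper is hypothetical, and the names are the import names from the list above.

```python
from importlib.util import find_spec

# Import names taken from the library list above.
REQUIRED = ["tensorflow", "numpy", "pandas", "matplotlib", "seaborn", "sklearn"]


def missing_modules(names=REQUIRED):
    """Return the import names that cannot be found in the current environment."""
    return [name for name in names if find_spec(name) is None]


missing = missing_modules()
if missing:
    print("Missing:", ", ".join(missing))
else:
    print("All listed libraries are importable.")
```

`find_spec` only locates a module without importing it, so the check stays fast even for heavy packages such as `tensorflow`.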
- Install dependencies:
python -m pip install -r requirements.txt
- Run the main script (uses `dataset/free-text.csv` and `dataset/demographics.csv`):
python models.py
Note: the script will automatically fall back to the local `dataset/` folder when not running in Google Colab.
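The fallback described in the note can be sketched roughly as follows. This is not the code in `models.py`. The check for `google.colab` in `sys.modules` is a common heuristic, the Colab path `/content/dataset` is a hypothetical placeholder, and both function names are made up for illustration.

```python
import sys
from pathlib import Path


def running_in_colab():
    """Heuristic: inside Colab the google.colab package is already loaded."""
    return "google.colab" in sys.modules


def resolve_dataset_dir(colab_dir="/content/dataset", local_dir="dataset"):
    """Pick the Colab location inside Colab, otherwise the local folder."""
    # colab_dir is a hypothetical placeholder; adjust it to your Colab setup.
    return Path(colab_dir) if running_in_colab() else Path(local_dir)


dataset_dir = resolve_dataset_dir()
print(dataset_dir / "free-text.csv")
```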
- Create and activate an environment with conda:
conda create -n keystroke python=3.11 -y
conda activate keystroke
- Install binary packages from conda-forge:
conda install -c conda-forge pandas numpy scipy scikit-learn matplotlib seaborn -y
- Run the script:
python models.py
- Create the venv (PowerShell):
python -m venv .venv
- Use the venv's Python directly (no activation required):
.venv\bin\python.exe -m pip install --upgrade pip setuptools wheel
.venv\bin\python.exe -m pip install -r requirements.txt
.venv\bin\python.exe models.py
Note: on Windows, venvs usually use `Scripts` instead of `bin` (check `.venv` for either `bin` or `Scripts` and adjust the paths above).
- SSL / certificate verify failures while pip installing: pip can fail with `ssl.SSLCertVerificationError` when it attempts to download build backends (cmake, ninja). Common fixes:
  - Use the Conda approach above (conda provides prebuilt binaries). This is recommended on Windows.
  - Ensure system time is correct and corporate proxies are configured. If your network intercepts TLS, install the organization's CA into the OS certificate store.
  - As a last resort, set the environment variable `PIP_DISABLE_PIP_VERSION_CHECK=1` and use `--trusted-host` for pip (not recommended for long-term use).
- Build errors for numpy/pandas: Building `numpy` and `pandas` from source on Windows often requires a C/C++ toolchain (MSVC) and CMake. Use conda to avoid this.
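For the `bin` versus `Scripts` note in the venv steps above, a small check like this can report which interpreter path actually exists. It is a sketch under the assumption that the venv lives in `.venv`; the `find_venv_python` helper is hypothetical.

```python
from pathlib import Path


def find_venv_python(venv_dir=".venv"):
    """Return the first interpreter found under Scripts/ or bin/, else None."""
    root = Path(venv_dir)
    candidates = [
        root / "Scripts" / "python.exe",  # usual layout on Windows
        root / "bin" / "python.exe",      # layout used in the commands above
        root / "bin" / "python",          # usual layout on Linux/macOS
    ]
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None


print(find_venv_python())
```

If it prints `None`, the venv was probably not created in the current directory.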