MUFASA: Fast and Accurate Multivariate Time-Series Clustering

Note 1: We adhere to the environment configurations specified in the original GitHub repositories of each method. In cases where dependencies conflict and cannot be resolved within a unified setup, we maintain separate environments for those methods.

Note 2: Both the MTS distance evaluation and the MTS clustering experiments use the same datasets: the original UEA archive and its downsampled version. For convenience, four dataset links are provided (two for distance evaluation and two for clustering). The distance evaluation code is adapted from an existing GitHub repository; the clustering pipeline is our own implementation.

Multivariate Time-Series Distances Evaluation

The code is in the Distance_Measure folder.

1. Setting up the environment:

conda create -n MUFASA_MTS_Dist python=3.10
conda activate MUFASA_MTS_Dist
pip install -r requirements_MTS_traditional.txt

2. Getting the datasets:

The original UEA archive, after preprocessing, can be downloaded from the following link. The downsampled version of the UEA archive (used to evaluate elastic measures) can be downloaded from here.

3. Running classification experiments:

To run inference on a given dataset with a given measure and Z-score normalization, run the Distance_Measure/main.py script with the following arguments:

-mp - run type (inference)
-d or --data - path to the data directory
-p or --problem - name of the dataset to run classification on (e.g. BasicMotions)
-m or --metric - name of the measure to use (e.g. euclidean)
-n or --norm - name of the normalization method to use (zscore)
-c or --metric_params - additional parameters for the metric, passed as key=value pairs separated by spaces
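
To illustrate the key=value format accepted by -c, the sketch below shows one way such pairs could be turned into a dictionary with argparse. It is not the parsing code from Distance_Measure/main.py: the helper name parse_metric_params and the rule for converting values are assumptions made for this example.

```python
import argparse
import ast


def parse_metric_params(pairs):
    """Turn ["key=value", ...] into a dict (hypothetical helper)."""
    params = {}
    for pair in pairs:
        key, sep, raw = pair.partition("=")
        if not sep or not key:
            raise ValueError(f"expected key=value, got {pair!r}")
        try:
            # Python literals such as 0.1 or 5 are converted;
            # values that are not literals stay plain strings.
            params[key] = ast.literal_eval(raw)
        except (ValueError, SyntaxError):
            params[key] = raw
    return params


# Hypothetical parser that mirrors only the -c flag described above.
parser = argparse.ArgumentParser()
parser.add_argument("-c", "--metric_params", nargs="*", default=[])

args = parser.parse_args(["-c", "sakoe_chiba_radius=0.1"])
metric_params = parse_metric_params(args.metric_params)
print(metric_params)
```

With nargs="*", several pairs can follow a single -c, separated by spaces, which matches the format described in the argument list.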

Example 1: To run inference with Euclidean distance on the BasicMotions dataset with Z-score normalization, run:

python3 Distance_Measure/main.py -mp inference -d $DATASET_DIR$ -p BasicMotions -m euclidean -n zscore
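
For readers unfamiliar with the zscore option, the following is a minimal sketch of Z-score normalization for a multivariate series. It assumes each channel of each series is normalized independently over time; the repository may normalize differently, and the function names here are hypothetical.

```python
from statistics import mean, pstdev


def zscore_channel(values, eps=1e-8):
    """Z-normalize one channel: subtract the mean, divide by the std."""
    mu = mean(values)
    sigma = pstdev(values, mu)
    if sigma < eps:
        # Constant channel: return zeros instead of dividing by ~0.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def zscore_series(series):
    """Normalize a multivariate series given as a list of channels."""
    return [zscore_channel(channel) for channel in series]


# Toy series with 2 channels and 4 time steps (made-up values).
series = [[1.0, 2.0, 3.0, 4.0], [5.0, 5.0, 5.0, 5.0]]
normalized = zscore_series(series)
print(normalized)
```

After normalization, every non-constant channel has zero mean and unit (population) standard deviation, so channels recorded on different scales contribute comparably to a distance.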

Example 2: To run inference with DTW-D distance on the BasicMotions dataset with Z-score normalization, run:

python3 Distance_Measure/main.py -mp inference -d $DATASET_DIR$ -p BasicMotions -m dtw-d -n zscore -c sakoe_chiba_radius=0.1
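
To make the sakoe_chiba_radius parameter concrete, here is a small pure-Python sketch of dependent DTW (DTW-D) restricted to a Sakoe-Chiba band. It is an illustration, not the implementation used in Distance_Measure. It assumes the radius is a fraction of the longer series length and that the result is the square root of the accumulated squared cost; both conventions vary between libraries.

```python
import math


def dtw_d(a, b, sakoe_chiba_radius=0.1):
    """Dependent DTW between two series shaped [time][channel].

    The band half-width is radius * max(len(a), len(b)), which is an
    assumption about how a fractional radius is interpreted.
    """
    n, m = len(a), len(b)
    # Widen the band to at least |n - m| so the end cell stays reachable.
    window = max(math.ceil(sakoe_chiba_radius * max(n, m)), abs(n - m))
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        lo = max(1, i - window)
        hi = min(m, i + window)
        for j in range(lo, hi + 1):
            # Squared Euclidean distance across all channels at once:
            # a single warping path is shared by every channel (DTW-D).
            d = sum((x - y) ** 2 for x, y in zip(a[i - 1], b[j - 1]))
            cost[i][j] = d + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return math.sqrt(cost[n][m])


# Two toy 2-channel series; b repeats the first time step of a.
a = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
b = [[0.0, 0.0], [0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
print(dtw_d(a, b, sakoe_chiba_radius=0.1))
```

A smaller radius restricts how far the warping path may stray from the diagonal, which lowers the cost of each comparison and limits how much temporal shift the measure can absorb.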

Multivariate Time-Series Clustering

The code is in the Clustering folder.

1. Setting up the environment:

Traditional Methods:

conda create -n MUFASA_MTS_Tradi python=3.10
conda activate MUFASA_MTS_Tradi
pip install -r requirements_MTS_traditional.txt

Deep Learning:

conda create -n MUFASA_MTS_DNN python=3.7
conda activate MUFASA_MTS_DNN
pip install -r requirements_MTS_DNN.txt
conda install pytorch==1.13.1 torchvision==0.14.1 torchaudio==0.13.1 cudatoolkit=11.7 -c pytorch -c nvidia

Foundation Models:

conda create -n MUFASA_MTS_FM python=3.10
conda activate MUFASA_MTS_FM
pip install -r requirements_MTS_FM.txt
conda install pytorch==2.3.0 torchvision==0.18.0 torchaudio==2.3.0 pytorch-cuda=12.1 -c pytorch -c nvidia

2. Getting the datasets:

The UEA archive, after preprocessing and Z-score normalization, can be downloaded from the following link. The downsampled version of the UEA archive (used to evaluate non-scalable algorithms) can be downloaded from here.

3. Running multivariate clustering experiments:

To run multivariate clustering, run the Clustering/Running_baselines_iter.py script with the following arguments:

-p or --path - path to the dataset directory (e.g. folder containing all UEA dataset subfolders)
-a or --algo - name of the algorithm to use (e.g. MUFASA)
-i or --itr - iteration number
-s or --save_path - path to save results

Example 1: To run MUFASA, run:

python3 Clustering/Running_baselines_iter.py -p $DATASET_DIR$ -a MUFASA -i 1 -s $SAVE_DIR$
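
If several runs are needed, the command above can be wrapped in a small driver. The sketch below assumes that -i denotes the index of a single run, so repeated runs call the script once per index. The paths, the number of iterations, and the helper names are hypothetical.

```python
import shlex
import subprocess
import sys


def build_commands(dataset_dir, save_dir, algo="MUFASA", iterations=range(1, 4)):
    """Build one command line per iteration number."""
    return [
        [sys.executable, "Clustering/Running_baselines_iter.py",
         "-p", dataset_dir, "-a", algo, "-i", str(i), "-s", save_dir]
        for i in iterations
    ]


def run_all(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(shlex.join(cmd))
        if not dry_run:
            # check=True stops the loop if one run fails.
            subprocess.run(cmd, check=True)


# Hypothetical paths; replace with your dataset and output directories.
commands = build_commands("data/UEA", "results")
run_all(commands)  # dry run: prints the three commands without executing
```

Using sys.executable keeps every run inside the currently activated conda environment. Call run_all(commands, dry_run=False) from the repository root to execute the runs.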

T-GMRF is kept in a separate folder due to integration complexity.

Example 2: To run T-GMRF on the BasicMotions dataset, run:

cd Clustering/T-GMRF
python3 Run_TGMRF_Combine.py -p $DATASET_DIR$ -i 1 -f BasicMotions

Univariate Time-Series Clustering

The code is in the Clustering folder.

1. Setting up the environment:

All methods except KASBA:

conda create -n MUFASA_UTS python=3.10
conda activate MUFASA_UTS
pip install -r requirements_UTS.txt

KASBA:

conda create -n MUFASA_KASBA python=3.10
conda activate MUFASA_KASBA
pip install -r requirements_KASBA.txt

2. Getting the datasets:

The UCR archive, after preprocessing and Z-score normalization, can be downloaded from the following link.

3. Running univariate clustering experiments:

To run univariate clustering, run the Clustering/Running_baseline_iter_univariate.py script with the following arguments:

-p or --path - path to the dataset directory (e.g. folder containing all UCR dataset subfolders)
-a or --algo - name of the algorithm to use (e.g. FASA)
-i or --itr - iteration number
-s or --save_path - path to save results

Example 1: To run FASA, run:

python3 Clustering/Running_baseline_iter_univariate.py -p $DATASET_DIR$ -a FASA -i 1 -s $SAVE_DIR$
