thedatumorg/MUFASA

MUFASA: Fast and Accurate Multivariate Time-Series Clustering

Note 1: We adhere to the environment configurations specified in the original GitHub repositories of each method. In cases where dependencies conflict and cannot be resolved within a unified setup, we maintain separate environments for those methods.

Note 2: The MTS distance evaluation and the MTS clustering experiments use the same datasets: the original UEA archive and its downsampled version. For convenience when running the experiments, four dataset links are provided (two for distance evaluation and two for clustering). The distance evaluation code is adapted from an existing GitHub repository; the clustering pipeline is our own implementation.

Multivariate Time-Series Distances Evaluation

The code is in the Distance_Measure folder.

1. Setting up the environment:

conda create -n MUFASA_MTS_Dist python=3.10
conda activate MUFASA_MTS_Dist
pip install -r requirements_MTS_traditional.txt

2. Getting the datasets:

The original UEA archive, after preprocessing, can be downloaded from the following link. The downsampled version of the UEA archive (used to evaluate elastic measures) can be downloaded from here.

3. Running classification experiments:

You can run inference on a specific dataset with a specific metric and Z-score normalization by running the Distance_Measure/main.py script with the following arguments:

  • -mp - run type (inference)
  • -d or --data - path to the data directory
  • -p or --problem - name of the dataset to run classification on (e.g. BasicMotions)
  • -m or --metric - name of the measure to use (e.g. euclidean)
  • -n or --norm - name of the normalization method to use (zscore)
  • -c or --metric_params - additional parameters for the metric, passed as key=value pairs separated by spaces
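The argument list above can be mirrored by a small parser. The following is a minimal sketch, not the repository's actual code: `build_parser` and `parse_metric_params` are hypothetical names, and the conversion of values to Python literals is an assumption about how the key=value pairs might be interpreted.

```python
import argparse
import ast


def parse_metric_params(pairs):
    """Turn ["a=1", "b=0.5"] into {"a": 1, "b": 0.5}."""
    params = {}
    for pair in pairs:
        key, sep, raw = pair.partition("=")
        if not sep or not key:
            raise ValueError(f"expected key=value, got {pair!r}")
        try:
            # Interpret numbers, booleans, etc.; fall back to the raw string.
            params[key] = ast.literal_eval(raw)
        except (ValueError, SyntaxError):
            params[key] = raw
    return params


def build_parser():
    # Hypothetical parser mirroring the documented flags (assumption).
    parser = argparse.ArgumentParser()
    parser.add_argument("-mp", default="inference")
    parser.add_argument("-d", "--data")
    parser.add_argument("-p", "--problem")
    parser.add_argument("-m", "--metric")
    parser.add_argument("-n", "--norm")
    # Zero or more space-separated key=value pairs.
    parser.add_argument("-c", "--metric_params", nargs="*", default=[])
    return parser
```

With this sketch, `-c sakoe_chiba_radius=0.1` would yield the dictionary `{"sakoe_chiba_radius": 0.1}`, which could then be passed to the metric as keyword arguments.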

Example 1: To run inference with Euclidean distance on the BasicMotions dataset with Z-score normalization, run:

python3 Distance_Measure/main.py -mp inference -d $DATASET_DIR$ -p BasicMotions -m euclidean -n zscore
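For readers unfamiliar with what this combination computes, here is a minimal pure-Python sketch. It assumes each series is stored as a list of channels, that Z-score normalization is applied per channel within each series, and that the Euclidean distance is taken over all channels and time steps; the repository may normalize or aggregate differently, and the function names are illustrative.

```python
import math


def zscore(channel, eps=1e-8):
    """Z-normalize one channel; constant channels map to all zeros."""
    mean = sum(channel) / len(channel)
    var = sum((x - mean) ** 2 for x in channel) / len(channel)
    std = math.sqrt(var)
    if std < eps:
        return [0.0 for _ in channel]
    return [(x - mean) / std for x in channel]


def euclidean_mts(a, b):
    """Euclidean distance between two equal-shape series (lists of channels)."""
    total = 0.0
    for ch_a, ch_b in zip(a, b):
        for x, y in zip(ch_a, ch_b):
            total += (x - y) ** 2
    return math.sqrt(total)


def normalized_distance(a, b):
    """Z-normalize every channel of both series, then compare them."""
    return euclidean_mts([zscore(ch) for ch in a], [zscore(ch) for ch in b])
```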

Example 2: To run inference with DTW-D distance on the BasicMotions dataset with Z-score normalization, run:

python3 Distance_Measure/main.py -mp inference -d $DATASET_DIR$ -p BasicMotions -m dtw-d -n zscore -c sakoe_chiba_radius=0.1
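As background for the command above, the sketch below illustrates dependent DTW (DTW-D), in which all channels share one warping path and the local cost is the squared Euclidean distance between time steps. It is a minimal illustration under stated assumptions, not the repository's implementation: series are lists of time steps (each a list of channel values), `sakoe_chiba_radius` is treated as a fraction of the longer series length, and the square root of the accumulated cost is returned.

```python
import math


def dtw_dependent(a, b, sakoe_chiba_radius=0.1):
    """DTW-D between two series given as lists of time steps."""
    n, m = len(a), len(b)
    # Assumption: the radius is a fraction of the longer series length.
    # The band is widened to |n - m| so the end cell stays reachable.
    window = max(int(sakoe_chiba_radius * max(n, m)), abs(n - m))
    inf = math.inf
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        lo = max(1, i - window)
        hi = min(m, i + window)
        for j in range(lo, hi + 1):
            # Local cost couples all channels at time steps i and j.
            d = sum((x - y) ** 2 for x, y in zip(a[i - 1], b[j - 1]))
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return math.sqrt(cost[n][m])
```

In this sketch, a radius of 0 on equal-length series reduces to the ordinary Euclidean distance, while larger radii allow progressively more warping.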

Multivariate Time-Series Clustering

The code is in the Clustering folder.

1. Setting up the environment:

Traditional Methods:

conda create -n MUFASA_MTS_Tradi python=3.10
conda activate MUFASA_MTS_Tradi
pip install -r requirements_MTS_traditional.txt

Deep Learning:

conda create -n MUFASA_MTS_DNN python=3.7
conda activate MUFASA_MTS_DNN
pip install -r requirements_MTS_DNN.txt
conda install pytorch==1.13.1 torchvision==0.14.1 torchaudio==0.13.1 cudatoolkit=11.7 -c pytorch -c nvidia

Foundation Models:

conda create -n MUFASA_MTS_FM python=3.10
conda activate MUFASA_MTS_FM
pip install -r requirements_MTS_FM.txt
conda install pytorch==2.3.0 torchvision==0.18.0 torchaudio==2.3.0 pytorch-cuda=12.1 -c pytorch -c nvidia

2. Getting the datasets:

The UEA archive, after preprocessing and Z-score normalization, can be downloaded from the following link. The downsampled version of the UEA archive (used to evaluate non-scalable algorithms) can be downloaded from here.

3. Running multivariate clustering experiments:

You can run multivariate clustering by running the Clustering/Running_baselines_iter.py script with the following arguments:

  • -p or --path - path to the dataset directory (e.g. folder containing all UEA dataset subfolders)
  • -a or --algo - name of the algorithm to use (e.g. MUFASA)
  • -i or --itr - iteration number
  • -s or --save_path - path to save results

Example 1: To run MUFASA, run:

python3 Clustering/Running_baselines_iter.py -p $DATASET_DIR$ -a MUFASA -i 1 -s $SAVE_DIR$
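If several iterations are needed, the command above can be generated in a loop. The following is a hedged sketch: it assumes that each value of `-i` denotes an independent run that can be launched separately, which this README does not state explicitly, and `build_command` and `run_iterations` are hypothetical helper names. By default it only prints the commands (`dry_run=True`).

```python
import subprocess
import sys


def build_command(dataset_dir, algo, iteration, save_dir):
    """Assemble the documented command line for one iteration."""
    return [
        sys.executable,
        "Clustering/Running_baselines_iter.py",
        "-p", dataset_dir,
        "-a", algo,
        "-i", str(iteration),
        "-s", save_dir,
    ]


def run_iterations(dataset_dir, algo, save_dir, iterations=range(1, 4), dry_run=True):
    """Print (or execute) one command per iteration number."""
    commands = [build_command(dataset_dir, algo, i, save_dir) for i in iterations]
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            # Assumption: runs are independent, so they execute sequentially here.
            subprocess.run(cmd, check=True)
    return commands
```

Calling `run_iterations("/path/to/UEA", "MUFASA", "/path/to/results", dry_run=False)` from the repository root would execute iterations 1 to 3 in sequence; the paths are placeholders.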

T-GMRF is kept in a separate folder due to integration complexity.

Example 2: To run T-GMRF on the BasicMotions dataset, run:

cd Clustering/T-GMRF
python3 Run_TGMRF_Combine.py -p $DATASET_DIR$ -i 1 -f BasicMotions

Univariate Time-Series Clustering

The code is in the Clustering folder.

1. Setting up the environment:

All methods except KASBA:

conda create -n MUFASA_UTS python=3.10
conda activate MUFASA_UTS
pip install -r requirements_UTS.txt

KASBA:

conda create -n MUFASA_KASBA python=3.10
conda activate MUFASA_KASBA
pip install -r requirements_KASBA.txt

2. Getting the datasets:

The UCR archive, after preprocessing and Z-score normalization, can be downloaded from the following link.

3. Running univariate clustering experiments:

You can run univariate clustering by running the Clustering/Running_baseline_iter_univariate.py script with the following arguments:

  • -p or --path - path to the dataset directory (e.g. folder containing all UCR dataset subfolders)
  • -a or --algo - name of the algorithm to use (e.g. FASA)
  • -i or --itr - iteration number
  • -s or --save_path - path to save results

Example 1: To run FASA, run:

python3 Clustering/Running_baseline_iter_univariate.py -p $DATASET_DIR$ -a FASA -i 1 -s $SAVE_DIR$
