A Python implementation of Shape-based Dynamic Time Warping (ShapeDTW) for time series analysis, featuring multiple shape descriptors and evaluation frameworks.
PyShapeDTW provides advanced time series similarity computation through shape-based features. It extends traditional Dynamic Time Warping (DTW) by incorporating local shape information, making it more robust to amplitude variations and noise while preserving important shape characteristics.
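The idea can be illustrated with a minimal NumPy sketch. This is not the package's API: the function names `subsequences`, `dtw` and `shape_dtw` are hypothetical, and the descriptor defaults to the raw subsequence. Each point is represented by a descriptor of the window around it, and ordinary DTW is then run on the descriptor sequences instead of on the raw values.

```python
import numpy as np


def subsequences(x, width):
    """Extract one window of length `width` around every point (edge-padded)."""
    half = width // 2
    padded = np.pad(x, (half, half), mode="edge")
    return np.stack([padded[i:i + width] for i in range(len(x))])


def dtw(cost):
    """Classic DTW on a precomputed pairwise cost matrix.

    Returns the accumulated cost and the warping path as (i, j) index pairs.
    """
    n, m = cost.shape
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            acc[i, j] = cost[i - 1, j - 1] + min(
                acc[i - 1, j - 1], acc[i - 1, j], acc[i, j - 1]
            )
    # Backtrack from the last cell to the first one.
    i, j = n, m
    path = [(i - 1, j - 1)]
    while (i, j) != (1, 1):
        steps = [(i - 1, j - 1), (i - 1, j), (i, j - 1)]
        i, j = min(steps, key=lambda s: acc[s])
        path.append((i - 1, j - 1))
    return acc[n, m], path[::-1]


def shape_dtw(x, y, width=5, descriptor=lambda s: s):
    """Hypothetical sketch: DTW over per-point shape descriptors."""
    dx = np.stack([descriptor(s) for s in subsequences(x, width)])
    dy = np.stack([descriptor(s) for s in subsequences(y, width)])
    # Euclidean distance between every pair of descriptors.
    cost = np.linalg.norm(dx[:, None, :] - dy[None, :, :], axis=2)
    return dtw(cost)


x = np.sin(np.linspace(0, 6, 30))
y = np.sin(np.linspace(0, 6, 40) ** 1.1 / 6 ** 0.1)
distance, path = shape_dtw(x, y, width=5)
```

Passing a different `descriptor` callable (for example a slope or PAA function) changes which aspect of the local shape drives the alignment.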
A demonstration notebook is available at pyshapeDTW/demo/demo.ipynb.
conda create -n shapeDTW
conda activate shapeDTW
pip install -e ".[all]"
To run the tests:
pytest
The project is organized into several key components:
- Elastic measures: The fundamental distance computation algorithms
  - shape_dtw.py: Main ShapeDTW implementation
  - derivative_dtw.py: DTW variant using derivative information
  - base_dtw.py: Base DTW functionality
- Shape descriptors: Different ways to capture local shape information
  - PAA: Piecewise Aggregate Approximation
  - Wavelets: Discrete Wavelet Transform descriptors
  - Slope: Linear slope computed over local segments
- Classification pipeline: Evaluates the performance of different DTW variants on time series classification tasks
- Alignment pipeline: Assesses the quality of sequence alignments produced by different methods
- Multiple shape descriptors: Choose from various shape description methods
- Flexible evaluation: Comprehensive frameworks for both classification and alignment tasks
- Easy integration: Clean API design for incorporating new descriptors or evaluation metrics
- UCR dataset support: Built-in support for the UCR time series dataset collection
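To make the descriptor idea concrete, here is a small sketch of two of the descriptors named above, PAA and slope, applied to a single local window. The function names and the `n_segments` parameter are assumptions for illustration and may not match the package's own classes. The wavelet descriptor is left out because it needs an extra dependency.

```python
import numpy as np


def paa(subseq, n_segments):
    """Piecewise Aggregate Approximation: the mean of each segment."""
    return np.array([seg.mean() for seg in np.array_split(subseq, n_segments)])


def slope(subseq, n_segments):
    """Least-squares slope of a line fitted to each segment.

    Every segment needs at least two points for the fit to be defined.
    """
    slopes = []
    for seg in np.array_split(subseq, n_segments):
        t = np.arange(len(seg))
        slopes.append(np.polyfit(t, seg, 1)[0])  # leading coefficient = slope
    return np.array(slopes)


window = np.array([1.0, 1.0, 3.0, 3.0, 5.0, 5.0])
paa_features = paa(window, n_segments=3)
slope_features = slope(window, n_segments=2)
```

PAA keeps the local level of the signal, while the slope descriptor is unchanged by a constant vertical offset, which is one reason a slope-style descriptor can be less sensitive to amplitude shifts.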
The package provides a Typer-based CLI with two main commands:
# Run classification evaluation
python -m pyshapeDTW ucr-classification
# Run alignment evaluation
python -m pyshapeDTW ucr-alignment
The classification pipeline evaluates how well different DTW variants classify time series by comparing their performance on standard benchmark datasets. The pipeline:
- Loads time series data
- Computes shape descriptors (if using ShapeDTW)
- Performs nearest-neighbor classification
- Evaluates accuracy metrics
It can be run on any UCR dataset.
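The classification steps above can be sketched as a 1-nearest-neighbor classifier with a DTW distance. This is an illustrative outline on toy data, not the package's pipeline: `dtw_distance` and `predict_1nn` are hypothetical names, and loading of real UCR data is omitted.

```python
import numpy as np


def dtw_distance(x, y):
    """Plain DTW distance with absolute difference as the local cost."""
    n, m = len(x), len(y)
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(x[i - 1] - y[j - 1])
            acc[i, j] = cost + min(acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1])
    return acc[n, m]


def predict_1nn(train_X, train_y, test_X, distance=dtw_distance):
    """Label each test series with the label of its nearest training series."""
    preds = []
    for query in test_X:
        dists = [distance(query, ref) for ref in train_X]
        preds.append(train_y[int(np.argmin(dists))])
    return np.array(preds)


# Toy data: class 0 is a rising ramp, class 1 is a falling ramp.
t = np.linspace(0.0, 1.0, 20)
train_X = [t, t[::-1]]
train_y = np.array([0, 1])
# Test series are time-warped versions of each class.
test_X = [t ** 2, 1.0 - t ** 2]
test_y = np.array([0, 1])

preds = predict_1nn(train_X, train_y, test_X)
accuracy = float(np.mean(preds == test_y))
```

Because the distance is passed in as a parameter, a shape-based or derivative-based variant could be compared against plain DTW by swapping the `distance` argument.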
The alignment pipeline assesses the quality of sequence alignments by:
- Generating synthetic warped sequences from the original data
- Attempting to recover the original warping
- Comparing the recovered alignment with ground truth
- Visualizing the results
It can be run on any UCR dataset.
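A rough sketch of this evaluation loop, under simplifying assumptions: the synthetic warp only stretches the series by repeating samples, the recovered alignment comes from plain DTW, and the error is a mean absolute deviation between recovered and true indices. The names `random_stretch`, `dtw_path` and `alignment_error` are hypothetical and the metric may differ from the one the package reports. Plotting is omitted.

```python
import numpy as np


def dtw_path(x, y):
    """Return the DTW warping path between x and y as (i, j) index pairs."""
    n, m = len(x), len(y)
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(x[i - 1] - y[j - 1])
            acc[i, j] = cost + min(acc[i - 1, j - 1], acc[i - 1, j], acc[i, j - 1])
    i, j = n, m
    path = [(i - 1, j - 1)]
    while (i, j) != (1, 1):
        steps = [(i - 1, j - 1), (i - 1, j), (i, j - 1)]
        i, j = min(steps, key=lambda s: acc[s])
        path.append((i - 1, j - 1))
    return path[::-1]


def random_stretch(n, rng, max_repeat=3):
    """Ground-truth warp: repeat each index 1..max_repeat times."""
    counts = rng.integers(1, max_repeat + 1, size=n)
    return np.repeat(np.arange(n), counts)


def alignment_error(path, warp):
    """Mean absolute deviation between recovered and true source indices."""
    matched = {}
    for i, j in path:
        matched.setdefault(j, []).append(i)
    recovered = np.array([np.mean(matched[j]) for j in range(len(warp))])
    return float(np.mean(np.abs(recovered - warp)))


rng = np.random.default_rng(0)
x = np.sin(np.linspace(0, 2 * np.pi, 40))
warp = random_stretch(len(x), rng)   # ground-truth mapping: y[k] = x[warp[k]]
y = x[warp]                          # synthetic warped copy
path = dtw_path(x, y)                # recovered alignment
error = alignment_error(path, warp)  # 0.0 would mean a perfect recovery
```

A lower error means the recovered path stays closer to the warp that generated the data, so the same loop can be repeated with different elastic measures to compare them.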
Detailed documentation is available in the docstrings. Each module is extensively documented with:
- Function and class descriptions
- Parameter specifications
- Usage examples
- Return value descriptions
The UCR Time Series Classification Archive is a large collection of time series datasets widely used for benchmarking. Due to computational constraints, we've carefully curated two dataset lists:
- todo_datasets.csv: A focused selection of the most computationally manageable datasets
- todo_datasets_extended.csv: An expanded list that includes additional datasets while maintaining reasonable computation times
This filtering approach retains approximately half of the original UCR archive, ensuring comprehensive evaluation while keeping computational requirements practical.