tslearn

A machine learning toolkit dedicated to time-series data

Primary language: Python. License: BSD 2-Clause "Simplified" (BSD-2-Clause).

tslearn is a Python package that provides machine learning tools for the analysis of time series. It builds on the scikit-learn, numpy and scipy libraries.

If you would like to contribute to tslearn, please have a look at our contribution guidelines.

Dependencies

Cython
numpy
scipy
scikit-learn

If you plan to use the shapelets module, keras and tensorflow should also be installed.
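Before installing, it can help to see which of the packages listed above are already importable. The following is only a minimal sketch using the Python standard library; the helper name `missing_packages` is hypothetical and not part of tslearn, and the import names (for instance `sklearn` for scikit-learn) are assumptions about how each package is usually imported.

```python
from importlib.util import find_spec

# Package name -> import name. Note that scikit-learn is imported as "sklearn".
REQUIRED = {
    "Cython": "Cython",
    "numpy": "numpy",
    "scipy": "scipy",
    "scikit-learn": "sklearn",
}
# Only needed for the shapelets module.
OPTIONAL = {"keras": "keras", "tensorflow": "tensorflow"}


def missing_packages(packages):
    """Return the package names whose import name cannot be located."""
    return [name for name, module in packages.items() if find_spec(module) is None]


if __name__ == "__main__":
    print("Missing required packages:", missing_packages(REQUIRED) or "none")
    print("Missing optional packages:", missing_packages(OPTIONAL) or "none")
```

The check only looks for an importable module; it does not verify versions or that compiled extensions load correctly.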

Installation

Using conda

The easiest way to install tslearn is probably via conda:

conda install -c conda-forge tslearn

Using PyPI

Using pip should also work fine:

pip install tslearn

Using the latest GitHub-hosted version

If you want tslearn's latest version, you can install it directly from the repository hosted on GitHub:

pip install git+https://github.com/rtavenar/tslearn.git

Troubleshooting

On some platforms, the Cython dependency does not seem to install properly. If you experience this issue, try installing Cython with the following command:

pip install cython

or (depending on your preferred Python package manager):

conda install -c anaconda cython

before you start installing tslearn.

Documentation and API reference

The documentation, including a gallery of examples, is hosted on Read the Docs.

Already available

  • A generators module provides random walk generators
  • A datasets module provides access to the famous UCR/UEA datasets through the UCR_UEA_datasets class
  • A preprocessing module provides standard time series scalers
  • A metrics module provides:
    • Dynamic Time Warping (DTW) (with Sakoe-Chiba band and Itakura parallelogram variants)
    • LB_Keogh
    • Global Alignment Kernel
    • Soft-DTW from Cuturi and Blondel
  • A neighbors module includes nearest neighbor algorithms to be used with time series
  • An svm module includes Support Vector Machine algorithms with:
    • Standard kernels offered in sklearn (with adequate array reshaping done for you)
    • Global Alignment Kernel
  • A clustering module includes the following time series clustering algorithms:
    • Standard Euclidean k-means (with adequate array reshaping done for you)
      • Based on tslearn.barycenters
    • DBA k-means from Petitjean et al.
      • Based on tslearn.barycenters, which offers a DBA facility that can be used for applications other than k-means
    • Global Alignment kernel k-means
    • KShape clustering from Paparrizos and Gravano
    • Soft-DTW k-means from Cuturi and Blondel
      • Based on tslearn.barycenters that offers Soft-DTW barycenter computation
    • It also provides a way to compute the silhouette coefficient for a given clustering and metric
  • A shapelets module includes an efficient implementation of the Learning Time-Series Shapelets method from Grabocka et al.
    • Warning: to use the shapelets module, two extra dependencies are required: keras and tensorflow
  • A piecewise module includes standard time series transformations, as well as the corresponding distances:
    • Piecewise Aggregate Approximation (PAA)
    • Symbolic Aggregate approXimation (SAX)
    • 1d-Symbolic Aggregate approXimation (1d-SAX)
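
To give an idea of what the DTW entry in the metrics module computes, here is a minimal pure-Python sketch of Dynamic Time Warping for univariate sequences, with an optional Sakoe-Chiba band. It is an illustration under stated assumptions, not tslearn's implementation: the function name `dtw_sketch` and the parameter `sakoe_chiba_radius` are chosen for this example, the local cost is assumed to be the squared difference, and the result is the square root of the accumulated cost.

```python
import math


def dtw_sketch(s1, s2, sakoe_chiba_radius=None):
    """DTW between two univariate sequences (illustrative sketch only).

    If sakoe_chiba_radius is given, element i of s1 may only be aligned
    with elements j of s2 such that |i - j| <= sakoe_chiba_radius.
    """
    n, m = len(s1), len(s2)
    # cost[i][j]: accumulated squared cost of the best warping path
    # aligning the first i elements of s1 with the first j elements of s2.
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        if sakoe_chiba_radius is None:
            j_min, j_max = 1, m
        else:
            j_min = max(1, i - sakoe_chiba_radius)
            j_max = min(m, i + sakoe_chiba_radius)
        for j in range(j_min, j_max + 1):
            local = (s1[i - 1] - s2[j - 1]) ** 2
            # Extend the cheapest of the three admissible predecessor cells.
            cost[i][j] = local + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return math.sqrt(cost[n][m])


if __name__ == "__main__":
    a, b = [0, 0, 1], [0, 1, 1]
    print(dtw_sketch(a, b))                        # unconstrained warping
    print(dtw_sketch(a, b, sakoe_chiba_radius=0))  # radius 0: no warping allowed
```

With a radius of 0 and sequences of equal length, the warping path is forced onto the diagonal, so the result equals the Euclidean distance. If the lengths differ by more than the radius, no admissible path exists and this sketch returns infinity.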

TODO list

Have a look at the project's TODO list for suggested features. If you want other ML methods for time series to be added to this list, do not hesitate to open an issue! See our contribution guidelines for more information about how to proceed.

Acknowledgments

The authors would like to thank Mathieu Blondel for providing code for Kernel k-means and Soft-DTW (both distributed under the BSD license), which is used in the clustering and metrics modules of this library.

Referencing tslearn

If you use tslearn in a scientific publication, we would appreciate citations:

@misc{tslearn,
 title={tslearn: A machine learning toolkit dedicated to time-series data},
 author={Tavenard, Romain},
 year={2017},
 note={\url{https://github.com/rtavenar/tslearn}}
}