tsml-java

Java time series machine learning tools in a Weka compatible toolkit


This repository is not currently being actively maintained and is not receiving new implementations (message updated 26/06/2023). Feel free to open bug reports; we may get around to fixing them eventually, but responses may be delayed. For the latest time series algorithms implemented and maintained by our group, see the Python-based aeon toolkit.

UEA Time Series Classification


A Weka-compatible Java toolbox for time series classification, clustering and transformation. For the Python sklearn-compatible version, see aeon.

Find out more info about our broader work and dataset hosting for the UCR univariate and UEA multivariate time series classification archives on our website.

This codebase is actively being developed for our research. The dev branch will contain the most up-to-date, but stable, code.

Installation

We are looking into deploying this project on Maven or Gradle in the future. For now there are two options:

  • download the jar file and include it as a dependency in your project, or run experiments from the command line (see the examples on running experiments);
  • fork or download the source files and include them in a project in your favourite IDE. You can then construct your own experiments (see our examples) and implement your own classifiers.

Overview

This codebase mainly provides implementations of different algorithms in a common framework. The lack of such a framework was a real problem, particularly in the period leading up to the Great Time Series Classification Bake Off, when implementations were scattered across Python, C/C++, Matlab, R, Java and other languages, or even combinations thereof.

We therefore mainly provide implementations of different classifiers as well as experimental and results analysis pipelines with the hope of promoting and streamlining open source, easily comparable, and easily reproducible results, specifically within the TSC space.

While they are obviously very important methods to study, we shall very likely not be implementing any kind of deep learning methods in our codebase, and leave those rightfully in the land of optimised languages and libraries for them. See aeon for implemented deep learning methods for time series data.

Our examples run through the basics of using the code. The basic layout of the codebase is as follows:

evaluation/
contains classes for generating, storing and analysing the results of your experiments
experiments/
contains classes specifying the experimental pipelines we utilise, and lists of classifier and dataset specifications. The 'main' class is Experiments.java; other experiments classes exist for running on simulation datasets or for generating transforms of time series for later classification, such as with the Shapelet Transform.
tsml/ and multivariate_timeseriesweka/
contain the TSC algorithms we have implemented, for univariate and multivariate classification respectively.
machine_learning/
contains extra algorithm implementations that are not specific to TSC, such as generalised ensembles or classifier tuners.

Implemented Algorithms

Classifiers

The lists of implemented TSC algorithms shall continue to grow over time. These are all in addition to the standard Weka classifiers and non-TSC algorithms defined under the machine_learning package.

We have implemented the following bespoke classifiers for univariate, equal length time series classification:

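| Distance Based | Dictionary Based | Kernel Based | Shapelet Based | Interval Based | Hybrids |
| --- | --- | --- | --- | --- | --- |
| DD_DTW | BOSS | Arsenal | LearnShapelets | TSF | HIVE-COTE |
| DTD_C | cBOSS | ROCKET | ShapeletTransform | TSBF | Catch22 |
| ElasticEnsemble | TDE | | FastShapelets | LPS | |
| NN_CID | WEASEL | | ShapeletTree | CIF | |
| SAX_1NN | SAXVSM | | | DrCIF | |
| ProximityForest | SpatialBOSS | | | RISE | |
| DTW_kNN | SAX_1NN | | | STSF | |
| FastDTW | BafOfPatterns... | | | | |
| FastElasticEn... | BOSSC45 | | | | |
| ShapeDTW_1NN | BoTSWEnsemble | | | | |
| ShapeDTW_SVM | BOSSSpatialPy... | | | | |
| SlowDTW_1NN | | | | | |
| KNN | | | | | |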
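For readers new to the distance-based family listed above, the following is a minimal, dependency-free sketch of the general idea behind a nearest-neighbour classifier using dynamic time warping (DTW). It is not the tsml implementation of DTW_kNN or any other listed classifier: the class name DtwNearestNeighbourSketch, its method names, the use of plain double[] arrays in place of Weka Instances, the full (unconstrained) warping window and the squared pointwise cost are all assumptions made for illustration.

```java
import java.util.Arrays;

/** Hypothetical sketch of 1-nearest-neighbour classification with DTW; not part of tsml. */
final class DtwNearestNeighbourSketch {

    /** DTW distance with a full warping window and squared pointwise cost (assumed settings). */
    static double dtw(double[] a, double[] b) {
        if (a.length == 0 || b.length == 0) {
            throw new IllegalArgumentException("series must be non-empty");
        }
        int n = a.length;
        int m = b.length;
        // cost[i][j] holds the cheapest alignment of the first i points of a with the first j points of b.
        double[][] cost = new double[n + 1][m + 1];
        for (double[] row : cost) {
            Arrays.fill(row, Double.POSITIVE_INFINITY);
        }
        cost[0][0] = 0.0;
        for (int i = 1; i <= n; i++) {
            for (int j = 1; j <= m; j++) {
                double diff = a[i - 1] - b[j - 1];
                double cheapestPredecessor =
                        Math.min(cost[i - 1][j - 1], Math.min(cost[i - 1][j], cost[i][j - 1]));
                cost[i][j] = diff * diff + cheapestPredecessor;
            }
        }
        return cost[n][m];
    }

    /** Returns the label of the training series closest to the query under DTW. */
    static int classify(double[][] train, int[] labels, double[] query) {
        if (train.length == 0 || train.length != labels.length) {
            throw new IllegalArgumentException("need one label per training series");
        }
        int bestIndex = 0;
        double bestDistance = Double.POSITIVE_INFINITY;
        for (int i = 0; i < train.length; i++) {
            double distance = dtw(train[i], query);
            if (distance < bestDistance) {
                bestDistance = distance;
                bestIndex = i;
            }
        }
        return labels[bestIndex];
    }

    public static void main(String[] args) {
        double[][] train = { {1.0, 2.0, 3.0, 4.0}, {4.0, 3.0, 2.0, 1.0} };
        int[] labels = {0, 1};
        // A rising query with a repeated first value still warps onto the rising training series.
        System.out.println(classify(train, labels, new double[] {1.0, 1.0, 2.0, 3.0, 4.0}));
    }
}
```

The sketch handles series of different lengths because DTW aligns points non-linearly; the classifiers listed here target equal-length series, so that flexibility is incidental.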

And we have implemented the following bespoke classifiers for multivariate, equal length time series classification:

NN_ED_D MultivariateShapeletTransform
NN_ED_I ConcatenateClassifier
NN_DTW_D MultivariateHiveCote
NN_DTW_I WEASEL+MUSE
STC_D MultivariateSingleEnsemble
NN_DTW_A MultivariateAbstractClassifier
MultivariateAbstractEnsemble

Clusterers

Currently quite limited, aside from those already shipped with Weka.

UnsupervisedShapelets  
K-Shape  
DictClusterer  
TTC  
AbstractTimeSeriesCLusterer  

Filters

These are SimpleBatchFilters: each takes an Instances object (the set of time series), transforms it, and returns a new Instances object.

ACF ACF_PACF ARMA
BagOfPatternsFilter BinaryTransform Clipping
Correlation Cosine DerivativeFilter
Differences FFT Hilbert
MatrixProfile NormalizeAttribute NormalizeCase
PAA PACF PowerCepstrum
PowerSepstrum RankOrder RunLength
SAX Sine SummaryStats
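As a rough illustration of the batch-filter pattern described above (take a set of series, transform each one, return a new set), here is a minimal, dependency-free sketch. It does not use Weka's Instances or any tsml class: the class DifferencesSketch and its methods are hypothetical names, plain double[][] arrays stand in for Instances, and the first-order difference it computes is only assumed to resemble what a filter such as Differences does.

```java
import java.util.Arrays;

/** Hypothetical, dependency-free sketch of a batch transform; not part of tsml or Weka. */
final class DifferencesSketch {

    /** First-order differences of one series: out[i] = series[i + 1] - series[i]. */
    static double[] difference(double[] series) {
        if (series.length < 2) {
            throw new IllegalArgumentException("need at least two observations");
        }
        double[] out = new double[series.length - 1];
        for (int i = 0; i < out.length; i++) {
            out[i] = series[i + 1] - series[i];
        }
        return out;
    }

    /** Transforms every series in the batch and returns a new batch; the input is left unmodified. */
    static double[][] transformBatch(double[][] batch) {
        double[][] result = new double[batch.length][];
        for (int i = 0; i < batch.length; i++) {
            result[i] = difference(batch[i]);
        }
        return result;
    }

    public static void main(String[] args) {
        double[][] batch = { {1.0, 4.0, 9.0, 16.0}, {2.0, 2.0, 5.0, 3.0} };
        double[][] transformed = transformBatch(batch);
        // prints [[3.0, 5.0, 7.0], [0.0, 3.0, -2.0]]
        System.out.println(Arrays.deepToString(transformed));
    }
}
```

Returning a fresh array rather than editing the input in place mirrors the "returns a new Instances object" behaviour noted above, so the original data stays available for other transforms.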

Transformers

We will be shifting over to a bespoke Transformer interface.

ShapeletTransform  
catch22  

Paper-Supporting Branches

This project acts as the general open-source codebase for our research, especially the Great Time Series Classification Bake Off. We are also trialling a process of creating stable branches in support of specific outputs.

Current branches of this type are:

Contributors

Lead: Anthony Bagnall (@TonyBagnall, @tony_bagnall, ajb@uea.ac.uk)

We welcome anyone who would like to contribute their algorithms!

License

GNU General Public License v3.0