TSAT is time series analysis software with GUI and CLI interfaces. It is a heavily modified version of GrammarViz with additional functionality. The GUI supports an interactive time series exploration workflow for discovering variable-length recurrent and anomalous patterns [4], along with time series classification by Representative Pattern Mining (RPM) using either the Euclidean or the Dynamic Time Warping (DTW) distance function [6].
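To illustrate how the two distance functions differ, here is a minimal Java sketch. It is not taken from the TSAT code base: the class name `DistanceSketch` and both method names are hypothetical, and the DTW variant shown is the plain unconstrained dynamic program with squared point costs, which may differ from what RPM actually uses (for example, a warping window).

```java
final class DistanceSketch {

    // Euclidean distance: compares points index by index, so both
    // series must have the same length.
    static double euclidean(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("series must have equal length");
        }
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    // Unconstrained DTW: finds the cheapest monotone alignment between
    // the two series, allowing one point to match several points.
    static double dtw(double[] a, double[] b) {
        int n = a.length;
        int m = b.length;
        double[][] cost = new double[n + 1][m + 1];
        for (double[] row : cost) {
            java.util.Arrays.fill(row, Double.POSITIVE_INFINITY);
        }
        cost[0][0] = 0.0;
        for (int i = 1; i <= n; i++) {
            for (int j = 1; j <= m; j++) {
                double d = a[i - 1] - b[j - 1];
                double best = Math.min(cost[i - 1][j - 1],
                        Math.min(cost[i - 1][j], cost[i][j - 1]));
                cost[i][j] = d * d + best;
            }
        }
        return Math.sqrt(cost[n][m]);
    }

    public static void main(String[] args) {
        double[] a = {0, 1, 2, 1, 0};
        double[] b = {0, 0, 1, 2, 1}; // same shape, shifted by one step
        System.out.println(euclidean(a, b)); // prints 2.0
        System.out.println(dtw(a, b));       // prints 1.0
    }
}
```

In this example the second series is a shifted copy of the first, so DTW reports a smaller distance than the Euclidean measure because it can realign the peak.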
From GrammarViz:
It is implemented in Java and is based on continuous signal discretization with SAX [1], grammatical inference with Sequitur [2] and Re-Pair [3], and algorithmic (Kolmogorov) complexity.
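As a rough illustration of the SAX discretization step, the following Java sketch converts one subsequence into a SAX word: z-normalize, reduce with Piecewise Aggregate Approximation (PAA), then map each segment mean to a letter using breakpoints of the standard normal distribution. The class `SaxSketch` and its methods are hypothetical and are not the GrammarViz or TSAT API. The sketch assumes the series length is divisible by the PAA size, supports alphabet sizes 2 to 5 only, and leaves out the sliding window.

```java
final class SaxSketch {

    // Breakpoints that split the standard normal distribution into
    // equiprobable regions, for alphabet sizes 2, 3, 4 and 5.
    private static final double[][] CUTS = {
        {0.0},
        {-0.4307273, 0.4307273},
        {-0.6744898, 0.0, 0.6744898},
        {-0.8416212, -0.2533471, 0.2533471, 0.8416212}
    };

    // Rescale to zero mean and unit standard deviation.
    static double[] zNormalize(double[] series) {
        double mean = 0.0;
        for (double v : series) {
            mean += v;
        }
        mean /= series.length;
        double var = 0.0;
        for (double v : series) {
            var += (v - mean) * (v - mean);
        }
        double sd = Math.sqrt(var / series.length);
        double[] out = new double[series.length];
        for (int i = 0; i < series.length; i++) {
            // A flat series has no shape to normalize; map it to zeros.
            out[i] = sd < 1e-9 ? 0.0 : (series[i] - mean) / sd;
        }
        return out;
    }

    // PAA: average the series over equally sized, consecutive segments.
    static double[] paa(double[] series, int segments) {
        if (segments <= 0 || series.length % segments != 0) {
            throw new IllegalArgumentException("length must be divisible by segments");
        }
        int width = series.length / segments;
        double[] out = new double[segments];
        for (int s = 0; s < segments; s++) {
            double sum = 0.0;
            for (int i = s * width; i < (s + 1) * width; i++) {
                sum += series[i];
            }
            out[s] = sum / width;
        }
        return out;
    }

    // Map each PAA value to a letter: 'a' for the lowest region, and so on.
    static String toSax(double[] series, int paaSize, int alphabetSize) {
        if (alphabetSize < 2 || alphabetSize > 5) {
            throw new IllegalArgumentException("alphabet size must be 2..5 in this sketch");
        }
        double[] cuts = CUTS[alphabetSize - 2];
        StringBuilder word = new StringBuilder();
        for (double v : paa(zNormalize(series), paaSize)) {
            int idx = 0;
            for (double cut : cuts) {
                if (v > cut) {
                    idx++;
                }
            }
            word.append((char) ('a' + idx));
        }
        return word.toString();
    }

    public static void main(String[] args) {
        double[] rising = {1, 2, 3, 4, 5, 6, 7, 8};
        System.out.println(toSax(rising, 4, 4)); // prints abcd
    }
}
```

In the full pipeline, words like this would be produced for each sliding-window position and the resulting sequence handed to a grammar inference algorithm such as Sequitur or Re-Pair.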
TSAT builds on GrammarViz, which also implements the "Rule Density Curve" and "Rare Rule Anomaly (RRA)" algorithms for time series anomaly discovery [5], which significantly outperform the HOT-SAX algorithm, the current state of the art for time series discord discovery. In the table below, performance is measured as the number of calls to the distance function (fewer is better). The last column shows the reduction in distance calls achieved by RRA relative to HOT-SAX:
Dataset and SAX parameters | Dataset size | Brute Force | HOT-SAX | RRA | Reduction |
---|---|---|---|---|---|
Daily commute (350,15,4) | 17,175 | 271,442,101 | 879,067 | 112,405 | 87.2% |
Dutch power demand (750,6,3) | 35,040 | 1.13 * 10^9 | 6,196,356 | 327,950 | 94.7% |
ECG 0606 (120,4,4) | 2,300 | 4,241,541 | 72,390 | 16,717 | 76.9% |
ECG 308 (300,4,4) | 5,400 | 23,044,801 | 327,454 | 14,655 | 95.5% |
ECG 15 (300,4,4) | 15,000 | 207,374,401 | 1,434,665 | 111,348 | 92.2% |
ECG 108 (300,4,4) | 21,600 | 441,021,001 | 6,041,145 | 150,184 | 97.5% |
ECG 300 (300,4,4) | 536,976 | 288 * 10^9 | 101,427,254 | 17,712,845 | 82.6% |
ECG 318 (300,4,4) | 586,086 | 343 * 10^9 | 45,513,790 | 10,000,632 | 78.0% |
Respiration, NPRS 43 (128,5,4) | 4,000 | 14,021,281 | 89,570 | 45,352 | 49.3% |
Respiration, NPRS 44 (128,5,4) | 24,125 | 569,753,031 | 1,146,145 | 257,529 | 77.5% |
Video dataset (150,5,3) | 11,251 | 119,935,353 | 758,456 | 69,910 | 90.8% |
Shuttle telemetry, TEK14 (128,4,4) | 5,000 | 22,510,281 | 691,194 | 48,226 | 93.0% |
Shuttle telemetry, TEK16 (128,4,4) | 5,000 | 22,491,306 | 61,682 | 15,573 | 74.8% |
Shuttle telemetry, TEK17 (128,4,4) | 5,000 | 22,491,306 | 164,225 | 78,211 | 52.4% |
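
The Reduction column can be reproduced from the HOT-SAX and RRA columns as the relative decrease in distance calls, that is, (HOT-SAX - RRA) / HOT-SAX. The small Java sketch below shows the arithmetic for the first row. The class and method names are hypothetical, and some published values may differ from a recomputation in the last digit because of rounding.

```java
import java.util.Locale;

final class ReductionSketch {

    // Relative decrease in distance function calls, in percent.
    static double reductionPercent(long hotSaxCalls, long rraCalls) {
        return 100.0 * (hotSaxCalls - rraCalls) / hotSaxCalls;
    }

    public static void main(String[] args) {
        // Daily commute row: HOT-SAX 879,067 calls, RRA 112,405 calls.
        double r = reductionPercent(879_067L, 112_405L);
        System.out.println(String.format(Locale.ROOT, "%.1f%%", r)); // prints 87.2%
    }
}
```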
[1] Lin, J., Keogh, E., Wei, L. and Lonardi, S., Experiencing SAX: a Novel Symbolic Representation of Time Series. DMKD Journal, 2007.
[2] Nevill-Manning, C.G., Witten, I.H., Identifying Hierarchical Structure in Sequences: A linear-time algorithm. arXiv:cs/9709102, 1997.
[3] Larsson, N. J., Moffat, A., Offline Dictionary-Based Compression, IEEE 88 (11): 1722–1732, doi:10.1109/5.892708, 2000.
[4] Senin, P., Lin, J., Wang, X., Oates, T., Gandhi, S., Boedihardjo, A.P., Chen, C., Frankenstein, S., Lerner, M., GrammarViz 2.0: a tool for grammar-based pattern discovery in time series, ECML/PKDD Conference, 2014.
[5] Senin, P., Lin, J., Wang, X., Oates, T., Gandhi, S., Boedihardjo, A.P., Chen, C., Frankenstein, S., Lerner, M., Time series anomaly discovery with grammar-based compression, The International Conference on Extending Database Technology, EDBT, 2015.
[6] Wang, X., Lin, J., Senin, P., Oates, T., Gandhi, S., Boedihardjo, A., Chen, C., Frankenstein, S. (2016). RPM: Representative Pattern Mining for Efficient Time Series Classification. In EDBT (pp. 185-196).
We use Maven and Java 7 to build an executable.

    $ java -version
    java version "1.7.0_80"
    Java(TM) SE Runtime Environment (build 1.7.0_80-b15)
    Java HotSpot(TM) 64-Bit Server VM (build 24.80-b11, mixed mode)

    $ mvn -version
    Apache Maven 2.2.1 (rdebian-8)
    Java version: 1.7.0_80
    Java home: /usr/lib/jvm/java-7-oracle/jre
    Default locale: fr_FR, platform encoding: UTF-8
    OS name: "linux" version: "3.2.0-86-generic" arch: "amd64" Family: "unix"

    $ mvn package -Psingle
    [INFO] Scanning for projects...
    ....
    [INFO] Building jar: /media/Stock/git/TSAT/target/tsat-0.0.1-SNAPSHOT-jar-with-dependencies.jar
    [INFO] ------------------------------------------------------------------------
    [INFO] BUILD SUCCESSFUL
    [INFO] ------------------------------------------------------------------------
    [INFO] Total time: 5 seconds
    [INFO] Finished at: Wed Jun 17 15:43:01 CEST 2015
    [INFO] Final Memory: 47M/238M
    [INFO] ------------------------------------------------------------------------

To run the GUI, use the GrammarVizGUI class, or run the jar from the command line:

    $ java -Xmx2g -jar target/tsat-0.0.1-SNAPSHOT-jar-with-dependencies.jar

(here -Xmx2g sets the maximum Java heap size to 2 GB).
Using the CLI, as discussed in the tutorials, it is possible to save the inferred grammar, motifs, and discords.