RNN-Time-series-Anomaly-Detection

An RNN-based time-series anomaly detector implemented in PyTorch.

Primary language: Python. License: Apache License 2.0 (Apache-2.0).

This is an implementation of an RNN-based time-series anomaly detector that uses a two-stage strategy: time-series prediction followed by anomaly score calculation.

Requirements

  • Ubuntu 16.04+ (errors have been reported on Windows 10; see the issue. Suggestions are welcome.)
  • Python 3.5+
  • PyTorch 0.4.0+
  • Numpy
  • Matplotlib
  • Scikit-learn

Dataset

1. NYC taxi passenger count

2. Electrocardiograms (ECGs)

  • An ECG dataset containing a single anomaly corresponding to a pre-ventricular contraction

3. 2D gesture (video surveillance)

  • X and Y coordinates of a hand gesture in a video

4. Respiration

  • A patient's respiration (measured by thorax extension, sampling rate 10 Hz)

5. Space shuttle

  • Space Shuttle Marotta Valve time-series

6. Power demand

  • One year's power demand at a Dutch research facility

Time series 2 to 6 are provided by E. Keogh et al. in "HOT SAX: Efficiently Finding the Most Unusual Time Series Subsequence," Fifth IEEE International Conference on Data Mining (2005) (dataset).

Implemented Algorithms

Example of usage

0. Download the dataset: Download the five multivariate time-series datasets (ecg, gesture, power_demand, respiration, space_shuttle) and label all anomalous points in them.

    python 0_download_dataset.py

1. Time-series prediction: Train and save an RNN-based time-series prediction model on a single time-series training set.

    python 1_train_predictor.py --data ecg --filename chfdb_chf14_45590.pkl
    python 1_train_predictor.py --data nyc_taxi --filename nyc_taxi.pkl

Train multiple models using the bash script

    ./1_train_predictor_all.sh
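
The training step above fits a model that predicts the next value of a series from its recent history. As a minimal sketch of that idea, the example below builds one-step-ahead training pairs and fits a least-squares autoregressive model with NumPy. The linear model is only a stand-in for the RNN, not the model trained by `1_train_predictor.py`; `make_windows`, `fit_linear_predictor`, `predict_next`, and the `history` parameter are hypothetical names chosen for illustration.

```python
import numpy as np

def make_windows(series, history):
    """Slice a 1-D series into (inputs, targets) for one-step-ahead prediction."""
    X = np.stack([series[i:i + history] for i in range(len(series) - history)])
    y = series[history:]
    return X, y

def fit_linear_predictor(series, history):
    """Least-squares autoregressive fit (a stand-in for the RNN, not the repo's model)."""
    X, y = make_windows(series, history)
    X1 = np.hstack([X, np.ones((len(X), 1))])  # append a bias column
    w, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return w

def predict_next(w, recent):
    """Predict the next value from the most recent `history` values."""
    return float(np.dot(w[:-1], recent) + w[-1])

# Toy training series: a sine wave with a period of 50 samples.
t = np.arange(400)
train = np.sin(2 * np.pi * t / 50)

w = fit_linear_predictor(train, history=10)
pred = predict_next(w, train[-10:])
true_next = np.sin(2 * np.pi * 400 / 50)
print("absolute one-step error:", abs(pred - true_next))
```

On a clean sine wave the one-step error is close to zero, because a sinusoid follows an exact linear recurrence. Real series such as ECG need a more expressive predictor, which is the role the RNN plays here.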

2. Anomaly detection: Fit a multivariate Gaussian distribution and calculate anomaly scores on a single time-series test set.

    python 2_anomaly_detection.py --data ecg --filename chfdb_chf14_45590.pkl --prediction_window 10
    python 2_anomaly_detection.py --data nyc_taxi --filename nyc_taxi.pkl --prediction_window 10

Test multiple models using the bash script

    ./2_anomaly_detection_all.sh
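
The scoring step can be sketched with NumPy alone. The example assumes that each row of `errors` holds the prediction errors collected for one time step across a window of 10 predictions (mirroring `--prediction_window 10`), and that the score is the squared Mahalanobis distance from a Gaussian fitted on normal data. Both points are assumptions for illustration, as are the names `fit_gaussian` and `anomaly_scores`; the synthetic errors are random numbers, not output from the repository.

```python
import numpy as np

def fit_gaussian(errors):
    """Estimate the mean and covariance of error vectors (rows = time steps)."""
    mu = errors.mean(axis=0)
    cov = np.cov(errors, rowvar=False)
    return mu, cov

def anomaly_scores(errors, mu, cov):
    """Squared Mahalanobis distance of each error vector from the fitted Gaussian."""
    diff = errors - mu
    cov_inv = np.linalg.pinv(cov)  # pseudo-inverse tolerates a near-singular covariance
    return np.einsum("ij,jk,ik->i", diff, cov_inv, diff)

rng = np.random.default_rng(0)
window = 10  # hypothetical counterpart of --prediction_window

# Synthetic "normal" errors used to fit the Gaussian.
normal_errors = rng.normal(0.0, 0.1, size=(500, window))
mu, cov = fit_gaussian(normal_errors)

# Synthetic test errors with one unusually large error vector at index 40.
test_errors = rng.normal(0.0, 0.1, size=(100, window))
test_errors[40] += 2.0
scores = anomaly_scores(test_errors, mu, cov)
print("highest-scoring time step:", int(np.argmax(scores)))
```

A time step is then flagged as anomalous when its score exceeds a chosen threshold; how that threshold is picked is left open in this sketch.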

Result

1. Time-series prediction: Predictions from the stacked RNN model

[Figures: prediction1, prediction2]

2. Anomaly detection: Anomaly scores from the multivariate Gaussian distribution model

[Equation image: equation1]
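
For reference, a common way to turn prediction errors into a score under a multivariate Gaussian is the squared Mahalanobis distance. The form below is an assumption based on that convention, not a transcription of the repository's equation. Here $e^{(t)}$ denotes the error vector at time $t$, and $\mu$ and $\Sigma$ are the mean and covariance fitted on normal data:

```latex
a^{(t)} = \left(e^{(t)} - \mu\right)^{\top} \Sigma^{-1} \left(e^{(t)} - \mu\right)
```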

  • NYC taxi passenger count

[Figure: scores1]

  • Electrocardiograms (ECGs) (filename: chfdb_chf14_45590)

[Figures: scores3, scores4, f1ecg1, f1ecg2]

Contact

If you have any questions, please open an issue.