
A combination of Autoencoder and Robust PCA


Robust Autoencoder

Robust Autoencoder is a model that combines an autoencoder with Robust PCA and can detect both noise and outliers. This repo offers an implementation based on TensorFlow.
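To make the combination concrete, here is a minimal sketch of the alternating split X = L + S that this kind of model relies on. It is not the code in this repo: to stay short and dependency-light, a truncated SVD stands in for the trained autoencoder, and the names soft_threshold, low_rank_reconstruct, robust_split, lam, rank and n_iter are all hypothetical.

```python
import numpy as np

def soft_threshold(x, eps):
    # Element-wise shrinkage toward zero: proximal operator of eps * ||x||_1.
    return np.sign(x) * np.maximum(np.abs(x) - eps, 0.0)

def low_rank_reconstruct(L, rank):
    # ASSUMPTION: stand-in for the autoencoder. A real robust autoencoder
    # would train a network on L and return its reconstruction instead.
    U, s, Vt = np.linalg.svd(L, full_matrices=False)
    return (U[:, :rank] * s[:rank]) @ Vt[:rank]

def robust_split(X, lam, rank=1, n_iter=20):
    # Alternate between reconstructing the clean part L = X - S
    # and shrinking the residual into the sparse part S.
    S = np.zeros_like(X)
    for _ in range(n_iter):
        L = X - S
        recon = low_rank_reconstruct(L, rank)
        S = soft_threshold(X - recon, lam)
    return X - S, S

# Toy data: a rank-1 matrix with one heavily corrupted entry.
clean = np.outer(np.ones(50), np.linspace(1.0, 2.0, 20))
X = clean.copy()
X[3, 7] += 10.0

L, S = robust_split(X, lam=1.0)
row, col = np.unravel_index(np.argmax(np.abs(S)), S.shape)
print("largest entry of S is at", (int(row), int(col)))
```

In this toy setting the largest entry of S should land on the corrupted cell, while L keeps the part of X that the reconstruction step can explain. Swapping the SVD stand-in for a trained autoencoder gives the nonlinear version described above.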

Updates

02/12/2018: Removed the Theano implementation.

02/14/2018: Cleaned up the code and moved the implementation into model/.

04/06/2018: Thanks to Tengke-Xiong. Deleted the incorrect part of l21shrink.

12/13/2018: Thanks to Roberto. Changed the getRecon function to accept X instead of L. This change allows Robust Autoencoder to detect anomalies in new data.

03/17/2019: Upgraded to Python 3 and repeated the outlier detection experiments.

Prerequisites

  • Python 3
  • NumPy
  • TensorFlow

Shortcut:

  • Denoising Model with l1 regularization on S is at:
    "l1 Robust Autoencoder"
  • Outlier Detection Model with l21 regularization on S.T is at:
    "l21 Robust Autoencoder"
  • Dataset and demo: The outlier detection data is sampled from the well-known MNIST dataset. The .npk file and the .txt file contain the same data, but the .npk file can only be loaded by NumPy under Python 2. Please find more details in the demo:
    "Demo"
  • Repeating the experiments in the paper: please go to "Outlier Detection".
    This folder also contains an l21 robust autoencoder implementation that needs different lambdas from those used by the implementations under the model/ folder. These lambdas are chosen to be exactly the same as the lambdas in our paper.
    Please follow these steps:
    python experiment1
    python experiment2
    Then open the IPython notebook and check the results.
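The two penalties named in the shortcuts above behave differently on S, and a small NumPy sketch may help show why one suits denoising and the other outlier detection. The functions below are the standard proximal operators for the l1 norm and for the l21 norm grouped by rows (one row per sample, which is how "l21 on S.T" is read here). The names l1_shrink and l21_shrink_rows are hypothetical and are not the functions shipped under model/.

```python
import numpy as np

def l1_shrink(eps, S):
    # Element-wise soft thresholding: each entry is shrunk on its own,
    # so isolated noisy entries can be removed anywhere in the matrix.
    return np.sign(S) * np.maximum(np.abs(S) - eps, 0.0)

def l21_shrink_rows(eps, S):
    # Group shrinkage by row: a row (sample) is scaled by
    # max(0, 1 - eps / ||row||_2), so a whole row is either kept
    # (slightly shrunk) or set to zero together.
    norms = np.linalg.norm(S, axis=1, keepdims=True)
    scale = np.maximum(1.0 - eps / np.maximum(norms, 1e-12), 0.0)
    return S * scale

S = np.array([[0.1, -0.2, 0.1],   # small residual everywhere
              [3.0, -4.0, 0.0],   # one sample with a large residual
              [0.0,  0.5, 0.0]])  # a single moderately noisy entry

print(l1_shrink(0.3, S))
print(l21_shrink_rows(1.0, S))
```

With these inputs, l1_shrink keeps individual large entries (including the 0.5 in the last row, reduced to 0.2), while l21_shrink_rows zeroes the first and last rows entirely and keeps only the second row, scaled by 0.8. Rows that survive the row-wise shrinkage are the natural outlier candidates.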

Citation

If you find this repo useful and would like to cite it, please cite our paper as follows:

@inproceedings{zhou2017anomaly,
  title={Anomaly detection with robust deep autoencoders},
  author={Zhou, Chong and Paffenroth, Randy C},
  booktitle={Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining},
  pages={665--674},
  year={2017},
  organization={ACM}
}

Reference

[1] Abadi, Martín, et al. "TensorFlow: A System for Large-Scale Machine Learning." OSDI. Vol. 16. 2016.
[2] LeCun, Yann, Corinna Cortes, and C. J. Burges. "MNIST handwritten digit database." AT&T Labs [Online]. Available: MNIST (2010).