The code for the voraus-AD dataset paper.

voraus-AD Dataset

This is the official repository for the paper "The voraus-AD Dataset for Anomaly Detection in Robot Applications" by Jan Thieß Brockmann, Marco Rudolph, Bodo Rosenhahn, and Bastian Wandt, which has been accepted to IEEE Transactions on Robotics and will be officially published soon.

We introduce the voraus-AD dataset, a novel dataset for anomaly detection in robotic applications, as well as MVT-Flow, an unsupervised method that detects anomalies in time series of robot machine data without requiring any anomalous samples in the training set.
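
The idea of training on normal data only and scoring new samples by their likelihood can be sketched in a few lines. The sketch below is not MVT-Flow: it swaps in a single one-dimensional Gaussian as the density model, and the function names (`fit_gaussian`, `anomaly_score`, `auroc`) and the toy values are hypothetical, chosen only for illustration.

```python
import math
from statistics import fmean, pstdev


def fit_gaussian(normal_samples):
    """Estimate mean and standard deviation from anomaly-free samples only."""
    return fmean(normal_samples), pstdev(normal_samples)


def anomaly_score(x, mean, std):
    """Negative log-likelihood of x under N(mean, std^2); higher means more anomalous."""
    return 0.5 * math.log(2 * math.pi * std**2) + (x - mean) ** 2 / (2 * std**2)


def auroc(scores_normal, scores_anomalous):
    """Fraction of (anomaly, normal) pairs in which the anomaly scores higher; ties count half."""
    wins = 0.0
    for a in scores_anomalous:
        for n in scores_normal:
            if a > n:
                wins += 1.0
            elif a == n:
                wins += 0.5
    return wins / (len(scores_anomalous) * len(scores_normal))


# Toy data (made up): the training set contains normal samples only.
train_normal = [0.9, 1.0, 1.1, 1.0, 0.95, 1.05]
test_normal = [0.97, 1.02, 1.08]
test_anomalous = [1.6, 0.2, 1.05]

mean, std = fit_gaussian(train_normal)
scores_n = [anomaly_score(x, mean, std) for x in test_normal]
scores_a = [anomaly_score(x, mean, std) for x in test_anomalous]
print(f"AUROC on toy data: {auroc(scores_n, scores_a):.2f}")
```

The anomalous samples are used only for evaluation, never for fitting, which mirrors the unsupervised setting described above.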

Download the Dataset 100 Hz
(~1.1 GB Disk / ~2.5 GB RAM) - used in this repository

Download the Dataset 500 Hz
(~5.3 GB Disk / ~12.5 GB RAM)

Please note: The datasets in both the 100 Hz and 500 Hz variants are licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0).

Getting Started

You will need Python 3.9 and the packages specified in requirements.txt. We recommend setting up a virtual environment and installing the packages there with pip.

Install packages with:

python3.9 -m venv venv
source venv/bin/activate
pip install --upgrade pip
pip install -r requirements.txt

Configure and Run

Set the variable DATASET_PATH in train.py to the path of the downloaded dataset file. The variable configuration holds the training configuration as well as the hyperparameters of the model; the paper describes all configuration parameters in detail. Also make sure to run the tests before training. The test test_train may take a few minutes depending on your setup.

pytest

train.py is the entry point to this repository; it contains the configuration, training, and validation steps for our model. The default configuration runs a training with the parameters given in the paper on the provided voraus-AD dataset (100 Hz). To start the training, just run train.py!

python train.py

If training on the voraus-AD data does not reach an AUROC greater than 0.9, something is likely wrong. Do not worry if the loss is negative: the loss is the negative log-likelihood, which can be negative because it is computed from probability densities rather than probabilities. Please let us know if you run into issues when using the code.
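
To see why a negative loss is not a problem, recall that a probability density, unlike a probability, can exceed 1, so its negative logarithm can drop below zero. The following is a minimal illustration with a one-dimensional Gaussian, which merely stands in for the density learned by the model; `gaussian_nll` is a hypothetical helper and not a function from this repository.

```python
import math


def gaussian_nll(x, mean, std):
    """Negative log-likelihood of x under a 1-D Gaussian N(mean, std^2)."""
    return 0.5 * math.log(2 * math.pi * std**2) + (x - mean) ** 2 / (2 * std**2)


# Wide distribution: the density at the mean is below 1, so the NLL is positive.
print(gaussian_nll(0.0, mean=0.0, std=1.0))

# Narrow distribution: the density at the mean exceeds 1, so the NLL is negative.
print(gaussian_nll(0.0, mean=0.0, std=0.1))
```

The sign of the loss therefore carries no meaning on its own; what matters is that the loss decreases during training.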

Development

We are using the following tools during development:

  • isort for import sorting
  • black for code formatting
  • mypy for static typing
  • pylint for static code analysis (linting)
  • pydocstyle for Docstring style checking
  • pytest for (unit) testing
  • tox for test automation

Before committing, make sure to format your code with:

isort .
black .

And execute all checks using the following command:

tox

Note: Running tox for the first time takes a few minutes because tox creates new virtual environments for linting and testing. Subsequent tox runs are much faster.

Credits

Some code from the FrEIA framework was used to implement the normalizing flows. Follow their tutorial if you need more documentation on it.

Citation

Please cite our paper in your publications if it helps your research.

@article { BroRud2023,
  author = {Jan Thie{\"s} Brockmann and Marco Rudolph and Bodo Rosenhahn and Bastian Wandt},
  title = {The voraus-AD Dataset for Anomaly Detection in Robot Applications},
  journal = {Transactions on Robotics},
  year = {2023},
  month = nov
}

License Notices

The content of this repository is licensed under the MIT License.
The datasets are licensed under the CC BY-NC-SA 4.0 License.