RecovDB: Recovery of missing values inside MonetDB

Prerequisites and dependencies

MonetDB (macOS)

Installation

$ brew install monetdb
$ pip3 install numpy
$ git clone https://github.com/eXascaleInfolab/2018-RecovDB.git recovdb
$ cd recovdb/
$ sh createdb.sh

Python configuration

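  • Add the following line to .profile (or .bash_profile), replacing 'HOME' with the path to your home directory:
export PYTHONPATH="${PYTHONPATH}:'HOME'/anaconda2/lib/python2.7/site-packages/"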

  • Reload the profile and restart:
$ source .profile    # or: source .bash_profile
$ sudo shutdown -r now

MonetDB (Ubuntu/Debian)

$ git clone https://github.com/eXascaleInfolab/2018-RecovDB.git recovdb
$ cd recovdb/
$ sh monetdb_install.sh
$ sh createdb.sh

Execution

Recovery of missing values in time series data

We show how to recover overlapping missing blocks in two climate time series located in recovery/input/original.txt:

$ sh connectdb.sh
sql> \<./recov_udf.sql
sql> \q

Centroid Decomposition of time series data

We show how to decompose a matrix of time series located in decomposition/input/climate.csv:

$ sh connectdb.sh
sql> \<./decomp_udf.sql
sql> \q

Datasets customization

To add a dataset to the recovery:

  • Name your file original.txt and add it to recovery/input/
  • Requirements: 4 columns; column separator: space; row separator: newline
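
Before copying a file into recovery/input/, the layout requirements above can be checked with a short script. This is a minimal sketch, not part of RecovDB: the function name `check_recovery_input` is hypothetical, it assumes Python 3, and it only verifies the column count and separator. It makes no assumption about how missing values are encoded, and it skips blank lines as a convenience.

```python
def check_recovery_input(path, expected_columns=4):
    """Return the number of data rows if every row has the expected column count.

    Raises ValueError naming the first line that does not match.
    """
    rows = 0
    with open(path) as handle:
        for line_number, line in enumerate(handle, start=1):
            line = line.rstrip("\r\n")
            if not line.strip():
                continue  # assumption: blank lines are ignored rather than rejected
            # Split strictly on the space character, the separator named in the requirements.
            fields = line.split(" ")
            if len(fields) != expected_columns or "" in fields:
                raise ValueError(
                    "line %d: expected %d space-separated values, got %r"
                    % (line_number, expected_columns, line)
                )
            rows += 1
    return rows
```

For example, `check_recovery_input("recovery/input/original.txt")` returns the row count when the file matches the layout, and raises an error pointing at the first offending line otherwise.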

To add a dataset to the decomposition:

  • Name your file climate.csv and add it to decomposition/input/
  • Requirements: column separator: space; row separator: newline
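
Despite its .csv extension, the decomposition input is expected to use a space as the column separator. If your data is comma-separated, it can be rewritten first. The following is a minimal sketch under stated assumptions: the function name `csv_to_space_separated` and the source file name in the usage note are hypothetical, Python 3 is assumed, and the source file is assumed to contain no header row (a header would be copied through unchanged).

```python
import csv


def csv_to_space_separated(src_path, dst_path):
    """Rewrite a comma-separated file as space-separated rows, one row per line.

    Returns the number of rows written.
    """
    rows_written = 0
    with open(src_path, newline="") as src, open(dst_path, "w") as dst:
        for row in csv.reader(src):
            if not row:
                continue  # skip empty lines in the source
            # Trim padding around each value, then join with a single space.
            dst.write(" ".join(field.strip() for field in row) + "\n")
            rows_written += 1
    return rows_written
```

A call such as `csv_to_space_separated("my_data.csv", "decomposition/input/climate.csv")` would then produce a file in the layout described above.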

Graphical RecovDB

RecovDB is also available as a GUI here.


Citation

Please cite the following paper when using RecovDB:

@inproceedings{arous2019recovdb,
  title={RecovDB: Accurate and Efficient Missing Blocks Recovery for Large Time Series},
  author={Arous, Ines and Khayati, Mourad and Cudr{\'e}-Mauroux, Philippe and Zhang, Ying and Kersten, Martin and Stalinlov, Svetlin},
  booktitle={2019 IEEE 35th International Conference on Data Engineering (ICDE)},
  pages={1976--1979},
  year={2019},
  organization={IEEE}
}