ImputeBench: Benchmark of Imputation Techniques in Time Series

ImputeBench implements 13 recovery techniques for blocks of missing values in time series and evaluates their precision and runtime on various real-world time series datasets using different recovery scenarios. Technical details can be found in our PVLDB 2020 paper: Mind the Gap: An Experimental Evaluation of Imputation of Missing Values Techniques in Time Series. The benchmark makes it easy to integrate new algorithms and datasets.

  • Build the Testing Framework using the installation script located in the root folder (takes ~1min):
    $ sh


    $ cd TestingFramework/bin/Debug/
    $ mono TestingFramework.exe [arguments]


-alg      -d            -scen
cdrec     airq          miss_perc
dynammo   bafu          ts_length
grouse    chlorine      ts_nbr
rosl      climate       miss_disj
softimp   drift10       miss_over
svdimp    electricity   mcar
svt       meteo         blackout
stmvl     temp          all
spirit    bafu_red
tenmf     drift10_red
tkcm      all


All results will be added to the Results folder. The accuracy results and plots of all algorithms will be added sequentially, for each scenario and dataset, to: Results/.../.../error/. The runtime results and plots of all algorithms will be added to: Results/.../.../runtime/.

Execution examples

  1. Run the whole benchmark (all algorithms, all datasets, all scenarios, precision and runtime)
    $ mono TestingFramework.exe -alg all -d all -scen all

Warning: Running the whole benchmark will take a sizeable amount of time (up to 4 days depending on the hardware) and will produce up to 15GB of output files with all recovered data and plots unless stopped early.

  2. Run a single algorithm (cdrec) on a single dataset (drift10) using one scenario (missing percentage)
    $ mono TestingFramework.exe -alg cdrec -d drift10 -scen miss_perc
  3. Run two algorithms (spirit, cdrec) on a single dataset (drift10) using one scenario (missing percentage)
    $ mono TestingFramework.exe -alg spirit,cdrec -d drift10 -scen miss_perc
  4. Run example 3 without runtime results
    $ mono TestingFramework.exe -alg spirit,cdrec -d drift10 -scen miss_perc -nort
  5. Additional command-line parameters
    $ mono TestingFramework.exe --help

Remark: Algorithms tkcm, spirit and ssa cannot handle multiple incomplete time series. These algorithms will not produce results for scenarios: miss_disj, miss_over, mcar and blackout.
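
The restriction in the remark above can be checked before launching a long run. The following Python sketch is illustrative only and is not part of ImputeBench: the helper name `unsupported_pairs` is hypothetical, and the algorithm and scenario names are copied from the remark.

```python
from itertools import product

# Algorithms that, per the remark above, cannot handle multiple incomplete series.
SINGLE_SERIES_ONLY = {"tkcm", "spirit", "ssa"}
# Scenarios for which those algorithms produce no results.
MULTI_SERIES_SCENARIOS = {"miss_disj", "miss_over", "mcar", "blackout"}


def unsupported_pairs(algorithms, scenarios):
    """Return the (algorithm, scenario) pairs that will yield no results."""
    return [
        (alg, scen)
        for alg, scen in product(algorithms, scenarios)
        if alg in SINGLE_SERIES_ONLY and scen in MULTI_SERIES_SCENARIOS
    ]


if __name__ == "__main__":
    # Lists every requested combination that the benchmark would skip.
    print(unsupported_pairs(["cdrec", "spirit"], ["miss_perc", "mcar"]))
```

An empty list means every requested combination is expected to produce output.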

Parametrized execution

  • You can parametrize each algorithm using the command -algx. For example, you can run the svdimp algorithm with a reduction value of 4 on the drift10 dataset while varying the number of time series (ts_nbr) as follows:
    $ mono TestingFramework.exe -algx svdimp 4 -d drift10 -scen ts_nbr
  • If you want to run some algorithms with default parameters and others with customized ones, you can use -alg and -algx together. For example, you can run the stmvl algorithm with its default parameters and the cdrec algorithm with a reduction value of 4 on the airq dataset while varying the number of time series (ts_nbr) as follows:
    $ mono TestingFramework.exe -alg stmvl -algx cdrec 4 -d airq -scen ts_nbr

Remark: The command -algx cannot be applied to a group of algorithms and must therefore precede the name of each algorithm.
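
When scripting several parametrized runs, it can help to assemble the argument list programmatically. The sketch below is a hypothetical helper (`build_args` is not part of ImputeBench) that mirrors the examples above: default algorithms are joined with commas after -alg, and each customized algorithm gets its own -algx. It assumes that -algx takes an algorithm name followed by a single value, as in the two examples, and it only prints the command instead of running it.

```python
import shlex


def build_args(default_algs, custom_algs, dataset, scenario):
    """Assemble TestingFramework.exe arguments as a list of strings.

    default_algs: algorithm names run with default parameters (joined after -alg).
    custom_algs:  mapping of algorithm name -> parameter value; each entry
                  gets its own -algx, as the remark above requires.
    """
    args = []
    if default_algs:
        args += ["-alg", ",".join(default_algs)]
    for name, value in custom_algs.items():
        args += ["-algx", name, str(value)]
    args += ["-d", dataset, "-scen", scenario]
    return args


if __name__ == "__main__":
    args = build_args(["stmvl"], {"cdrec": 4}, "airq", "ts_nbr")
    # Print the command line rather than executing it; mono is not invoked here.
    print(shlex.join(["mono", "TestingFramework.exe", *args]))
```

Returning a list rather than a single string keeps the arguments ready for `subprocess.run` without any shell quoting concerns.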

Algorithm and Dataset Insertion

  • To add your own algorithm to the benchmark, please refer to this tutorial.
  • To add your own dataset:
    • Import the file to TestingFramework/bin/Debug/data/{name}/{name}_normal.txt ({name} is the name of your dataset).
    • Requirements: rows >= 1'000, columns >= 10, column separator: single space, row separator: newline
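
Before copying a new dataset into place, the requirements above can be verified with a short script. The following Python sketch is an illustration under stated assumptions, not a tool shipped with ImputeBench: the function name `check_dataset` is hypothetical, and the check that every value parses as a number is an assumption that goes beyond the listed requirements.

```python
import os
import tempfile


def check_dataset(path, min_rows=1000, min_cols=10):
    """Return a list of problems found in a space-separated dataset file."""
    problems = []
    with open(path) as handle:
        # One row per line; columns are separated by a single space.
        rows = [line.split(" ") for line in handle.read().splitlines() if line]
    if len(rows) < min_rows:
        problems.append(f"only {len(rows)} rows, need at least {min_rows}")
    widths = {len(row) for row in rows}
    if len(widths) > 1:
        problems.append(f"inconsistent column counts: {sorted(widths)}")
    if widths and min(widths) < min_cols:
        problems.append(f"only {min(widths)} columns, need at least {min_cols}")
    # Assumption: every value should parse as a number.
    for number, row in enumerate(rows, start=1):
        try:
            [float(value) for value in row]
        except ValueError:
            problems.append(f"non-numeric or empty value on row {number}")
            break
    return problems


if __name__ == "__main__":
    # Write a small synthetic file (placeholder values) and validate it.
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "demo_normal.txt")
        with open(path, "w") as handle:
            for i in range(1000):
                handle.write(" ".join(str(i + j) for j in range(10)) + "\n")
        print(check_dataset(path))
```

An empty list means the file satisfies the row, column and separator requirements as interpreted here.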


 author    = {Mourad Khayati and Alberto Lerner and Zakhar Tymchenko and Philippe Cudr{\'{e}}{-}Mauroux},
 title     = {Mind the Gap: An Experimental Evaluation of Imputation of Missing Values Techniques in Time Series},
 booktitle = {Proceedings of the VLDB Endowment},
 volume    = {13},
 number    = {5},
 year      = {2020}


ImputeBench has received the VLDB 2020 Most Reproducible Paper Award.
