Over-optimism in benchmark studies and the multiplicity of design and analysis options when interpreting their results
This repository contains material to reproduce the results of the article "Over-optimism in benchmark studies and the multiplicity of design and analysis options when interpreting their results" by Christina Nießl, Moritz Herrmann, Chiara Wiedemann, Giuseppe Casalicchio and Anne-Laure Boulesteix (https://wires.onlinelibrary.wiley.com/doi/full/10.1002/widm.1441).
The code was written and executed using R version 4.0.2 (2020-06-22) (Platform: x86_64-w64-mingw32/x64 (64-bit))
with package versions latex2exp_0.4.0, stringi_1.5.3, RColorBrewer_1.1-2, forcats_0.5.0, scales_1.1.1,
tidyr_1.1.2, ggrepel_0.8.2, ggplot2_3.3.2, gridExtra_2.3, smacof_2.1-1, e1071_1.7-4, colorspace_1.4-1,
plotrix_3.7-8, dplyr_1.0.2, reshape2_1.4.4, shades_1.4.0.
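Because results may differ under other package versions, it can help to compare the local installation against the versions listed above before running anything. The following is a minimal sketch using base R only; `check_versions` is a hypothetical helper and is not part of this repository.

```r
# Package versions as listed in this README (name = version string).
expected <- c(
  latex2exp = "0.4.0", stringi = "1.5.3", RColorBrewer = "1.1-2",
  forcats = "0.5.0", scales = "1.1.1", tidyr = "1.1.2",
  ggrepel = "0.8.2", ggplot2 = "3.3.2", gridExtra = "2.3",
  smacof = "2.1-1", e1071 = "1.7-4", colorspace = "1.4-1",
  plotrix = "3.7-8", dplyr = "1.0.2", reshape2 = "1.4.4",
  shades = "1.4.0"
)

# Hypothetical helper (not part of the repository): one row per package
# with the expected version, the installed version and whether they agree.
check_versions <- function(expected) {
  rows <- lapply(names(expected), function(pkg) {
    # system.file() returns "" when the package is not installed.
    is_installed <- nzchar(system.file(package = pkg))
    installed <- if (is_installed) utils::packageVersion(pkg) else NULL
    data.frame(
      package = pkg,
      expected = expected[[pkg]],
      installed = if (is_installed) as.character(installed) else NA_character_,
      # Compare as version objects so that "1.1-2" and "1.1.2" are treated alike.
      matches = is_installed && installed == package_version(expected[[pkg]]),
      stringsAsFactors = FALSE
    )
  })
  do.call(rbind, rows)
}

print(check_versions(expected))
```

Rows with `matches == FALSE` indicate packages that are missing or installed in a different version.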
The folder R consists of:
01_generate_rankdata.R
- generates the rank data for 288 combinations of design and analysis options (Data/rankdata.RData) and for 774 combinations of design and analysis options (Data/rankdata_datasample.RData)
02_unfolding_models.R
- generates three unfolding models:
  - model 1 representing 288 combinations (Data/unfolding_model.RData)
  - model 2 representing 774 combinations with ibrier as performance measure (Data/unfolding_model_datasample_ibrier.RData)
  - model 3 representing 774 combinations with cindex as performance measure (Data/unfolding_model_datasample_cindex.RData)
- generates goodness-of-fit measures and figures for all unfolding models
03_results.R
- generates the figures shown in the results section of this paper
helper_fcts_generate_rankdata.R
- helper functions for R/01_generate_rankdata.R
helper_fcts_results.R
- helper functions for R/03_results.R
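To give a rough idea of what 02_unfolding_models.R works with, the sketch below fits an unfolding model to a toy rank matrix. It does not reproduce the repository code. It assumes that the smacof package from the version list above is used for the fit, which this README does not state explicitly. The rank matrix, its dimensions and all row and column names are made up for illustration.

```r
set.seed(1)

# Toy rank data (made up): 6 hypothetical combinations of design and
# analysis options (rows), each ranking 4 hypothetical methods (columns).
ranks <- t(replicate(6, sample(1:4)))
dimnames(ranks) <- list(
  paste0("combination_", 1:6),
  paste0("method_", 1:4)
)
print(ranks)

# Fit a two-dimensional unfolding model if smacof is available.
# Assumption: smacof::unfolding() is the relevant function here.
if (requireNamespace("smacof", quietly = TRUE)) {
  fit <- smacof::unfolding(ranks, ndim = 2)
  print(fit$stress)    # goodness-of-fit (stress) value
  print(fit$conf.row)  # coordinates of the combinations
  print(fit$conf.col)  # coordinates of the methods
}
```

In the fitted configuration, a method that a combination ranks well should lie close to that combination, which is what makes the model useful for visualising how rankings vary across options.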
The folder Data consists of:
- the data resulting from the original benchmark experiment by Herrmann et al. (2020): Data/datasets_overview.csv, Data/merged-results_na.RData
- the rank data generated by 288 and 774 combinations of design and analysis options: Data/rankdata.RData, Data/rankdata_datasample.RData
- the three unfolding models (see above): Data/unfolding_model.RData, Data/unfolding_model_datasample_ibrier.RData, Data/unfolding_model_datasample_cindex.RData
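This README does not list the object names stored in the .RData files. One cautious way to inspect them is to load a file into its own environment rather than into the global workspace. A minimal sketch follows; `load_into_env` is a hypothetical helper, and the commented example assumes the working directory is the repository root.

```r
# Hypothetical helper (not part of the repository): load an .RData file
# into a separate environment so nothing in the workspace is overwritten.
load_into_env <- function(path) {
  env <- new.env(parent = emptyenv())
  load(path, envir = env)
  env
}

# Example, assuming the working directory is the repository root:
# rank_env <- load_into_env("Data/rankdata.RData")
# ls(rank_env)                  # names of the stored objects
# str(as.list(rank_env), 1)     # brief overview of their structure
```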
To reproduce the figures displayed in this paper, run R/03_results.R (and R/02_unfolding_models.R for the goodness-of-fit plots).
To reproduce the whole analysis, run R/01_generate_rankdata.R, R/02_unfolding_models.R and R/03_results.R, in that order.
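The run order can also be scripted. The sketch below sources the scripts one after another. It assumes that the working directory is the repository root and that the scripts resolve their own paths relative to it, which this README does not confirm. `run_pipeline` is a hypothetical helper, not part of the repository.

```r
# Scripts in the order they are meant to be run (paths relative to the
# repository root, which is assumed to be the working directory).
scripts <- c(
  "R/01_generate_rankdata.R",
  "R/02_unfolding_models.R",
  "R/03_results.R"
)

# Hypothetical helper: check that all scripts exist, then source them in order.
run_pipeline <- function(scripts) {
  missing <- scripts[!file.exists(scripts)]
  if (length(missing) > 0) {
    stop("Script(s) not found: ", paste(missing, collapse = ", "))
  }
  for (s in scripts) {
    message("Running ", s)
    source(s, echo = FALSE)
  }
  invisible(scripts)
}

# run_pipeline(scripts)       # whole analysis
# run_pipeline(scripts[3])    # figures only, using the data shipped in Data/
```

Checking for missing files up front avoids starting a long run that would fail at a later step.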