/VaDER_supporting_code

Supporting code for the manuscript "Deep learning for clustering of multivariate clinical patient trajectories with missing values"

Primary language: R. License: GNU Lesser General Public License v2.1 (LGPL-2.1)

The clustering algorithm VaDER (https://github.com/johanndejong/VaDER) is needed to run much of the code in this repository.

The code depends on data from ADNI (http://adni.loni.usc.edu/) and PPMI (https://www.ppmi-info.org/), whose license agreements do not allow me to make the data public here. Instead, I have supplied artificial patient data as input for the following scripts:

  • ADNI_hyperparameter_optimization.r
  • ADNI_optimal_model.r
  • PPMI_hyperparameter_optimization.r
  • PPMI_optimal_model.r

The artificial data has been randomly sampled from the latent Gaussian mixture distribution that we learn as part of training VaDER (https://github.com/johanndejong/VaDER) on the ADNI and PPMI data. It therefore closely resembles the original data, including the pattern of missing values.
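To illustrate what sampling from a Gaussian mixture involves, here is a minimal base-R sketch. It is not the code used to generate the supplied data: the mixture weights, means and standard deviations are made-up values, the diagonal covariance is an assumption, and `sample_gmm` is a hypothetical helper name. Turning latent samples into patient trajectories requires the trained VaDER model and is not shown.

```r
# Illustrative sketch only: all parameter values below are hypothetical,
# NOT the values learned from ADNI or PPMI.
set.seed(1)

weights <- c(0.5, 0.3, 0.2)                        # mixture weights, sum to 1
means   <- rbind(c(-2, 0), c(0, 2), c(3, -1))      # one row per component
sds     <- rbind(c(0.5, 0.5), c(1, 0.3), c(0.4, 0.8))  # diagonal covariance (assumed)

# Draw n points from a Gaussian mixture with diagonal covariances.
sample_gmm <- function(n, weights, means, sds) {
  # Step 1: pick a mixture component for every sample.
  k <- sample(seq_along(weights), size = n, replace = TRUE, prob = weights)
  # Step 2: draw from the Gaussian of the chosen component.
  d <- ncol(means)
  noise <- matrix(rnorm(n * d), nrow = n, ncol = d)
  z <- means[k, , drop = FALSE] + sds[k, , drop = FALSE] * noise
  list(component = k, z = z)
}

latent <- sample_gmm(100, weights, means, sds)
dim(latent$z)            # 100 rows (samples), 2 columns (latent dimensions)
table(latent$component)  # number of samples drawn from each component
```

Each sample is drawn in two steps: first a component according to the mixture weights, then a point from that component's Gaussian.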

Note that running the *_hyperparameter_optimization.r scripts is computationally very intensive and is recommended only on a compute cluster. The *_optimal_model.r scripts use output generated by the *_hyperparameter_optimization.r scripts. However, I have commented out their first two sections (which parse the hyperparameter optimization results) and hard-coded the optimal hyperparameter settings, so the *_optimal_model.r scripts can be run directly, without running the hyperparameter optimization first.
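If you do run the hyperparameter optimization yourself, one possible way to switch between parsed results and hard-coded settings is sketched below. This is an assumption-laden illustration, not the logic in the *_optimal_model.r scripts: the file name, the parameter names, the `score` column (assumed higher is better) and the helper `load_params` are all hypothetical.

```r
# Illustrative sketch only: file name, parameter names and values are
# hypothetical; the actual scripts hard-code their own settings.
default_params <- list(n_clusters = 3, learning_rate = 1e-3, batch_size = 16)

# Use optimization results if present, otherwise fall back to the defaults.
load_params <- function(results_file, defaults) {
  if (!file.exists(results_file)) {
    return(defaults)
  }
  # Assumed layout: a data.frame with one row per hyperparameter setting
  # and a 'score' column where higher is better.
  results <- readRDS(results_file)
  best <- results[which.max(results$score), , drop = FALSE]
  as.list(best[names(defaults)])
}

params <- load_params("hyperparameter_results.rds", default_params)
str(params)
```

With no results file present, `params` is simply the hard-coded list, which mirrors the default behaviour described above.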