- Aim: develop a Multi-view deconfounding VAE (multi-view data integration + confounder correction)
- Data:
- Rotterdam study
- 500 individuals
- Cardiovascular diseases
- Methylation
- 3D facial images
- Toy data (TCGA)
- 2547 patients
- 6 cancers
- 2000 most variable mRNAs
- 2000 most variable DNAm
- Rotterdam study
- Conducted by Sonja Katz and Zuqi Li
- Supervised by Prof. Kristel Van Steen, Dr. Gennady Roshchupkin and Prof. Vitor Martins Dos Santos
- Google folder: https://drive.google.com/drive/folders/1GwZbMpVWW4xqdxmw_JRq9-DAR0WlnkE4
```shell
cd Multi-view-Deconfounding-VAE
conda env create -f environment.yml
source activate env_multiviewVAE
```
- All models are in the folder `models`
- Optimal architecture from the `modelOptimisation` experiments: `latentSize = 50`; `hiddenlayers = 200`
- XVAE: `XVAE`
  - Simidjievski, Nikola, et al. "Variational autoencoders for cancer data integration: design principles and computational practice." Frontiers in Genetics 10 (2019): 1205.
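The roadmap below lists two latent-loss options (KL divergence and Maximum Mean Discrepancy). A minimal sketch of both, with illustrative function names rather than the repo's actual API:

```python
# Hedged sketch of the two latent-loss options (KL and MMD); the function
# names and the RBF-kernel MMD estimator are illustrative choices.
import torch

def kl_loss(mu, logvar):
    # Closed-form KL(q(z|x) || N(0, I)) for a diagonal Gaussian posterior,
    # averaged over the batch.
    return -0.5 * torch.mean(
        torch.sum(1 + logvar - mu.pow(2) - logvar.exp(), dim=1)
    )

def mmd_loss(z, sigma=1.0):
    # MMD between latent samples z and draws from a standard-normal prior,
    # using a Gaussian (RBF) kernel.
    prior = torch.randn_like(z)
    def kernel(a, b):
        d = torch.cdist(a, b).pow(2)
        return torch.exp(-d / (2 * sigma ** 2))
    return (kernel(z, z).mean()
            + kernel(prior, prior).mean()
            - 2 * kernel(z, prior).mean())

mu = torch.zeros(64, 50)          # latentSize = 50, as in modelOptimisation
logvar = torch.zeros(64, 50)
z = torch.randn(64, 50)
kl = kl_loss(mu, logvar)          # zero when the posterior equals the prior
mmd = mmd_loss(z)
```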
- cXVAE:
  - `input+embed`
  - `input`
  - `embed`
  - `fused+embed`
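The cXVAE variants differ in where the confounder vector is injected: at the encoder input (`input`), at the decoder input, i.e. the latent embedding (`embed`), at both (`input+embed`), or at the fused representation (`fused+embed`). A sketch of the first two injection points, assuming plain linear layers rather than the repo's actual modules:

```python
# Illustrative sketch (not the repo's code) of cXVAE conditioning:
# "input" concatenates the confounders c to the views at the encoder;
# "embed" concatenates c to the latent code at the decoder.
import torch
import torch.nn as nn

x1 = torch.randn(8, 2000)                     # mRNA view
x2 = torch.randn(8, 2000)                     # DNAm view
c = torch.randn(8, 3)                         # confounder covariates

encoder = nn.Linear(2000 + 2000 + 3, 50)      # "input" conditioning
z = encoder(torch.cat([x1, x2, c], dim=1))

decoder = nn.Linear(50 + 3, 2000 + 2000)      # "embed" conditioning
x_hat = decoder(torch.cat([z, c], dim=1))
print(x_hat.shape)                            # torch.Size([8, 4000])
```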
- Adversarial Training
  - XVAE with one adversarial network and multiclass prediction: `adversarial_XVAE_multiclass`
    - `XVAE_adversarial_multiclass`: inspired by Dincer et al.; training over all batches
    - `XVAE_adversarial_1batch_multiclass`: original by Dincer et al.
    - `XVAE_scGAN_multiclass`: inspired by Bahrami et al.
  - XVAE with multiple adversarial networks (one for each confounder): `adversarial_XVAE_multipleAdvNet`
  - XAE with multiple adversarial networks (outdated?)
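The "ping pong" training used by the adversarial variants alternates between fitting an adversary that predicts the confounder class from the latent code and updating the encoder to defeat it. A minimal sketch; all module and variable names are illustrative, not the repo's code:

```python
# Illustrative ping-pong sketch of adversarial deconfounding: an adversary
# learns to predict the confounder class from the latent code z, while the
# encoder is updated to make that prediction fail.
import torch
import torch.nn as nn

encoder = nn.Linear(100, 50)                  # stand-in for the XVAE encoder
adversary = nn.Linear(50, 4)                  # multiclass confounder predictor
opt_enc = torch.optim.Adam(encoder.parameters(), lr=1e-3)
opt_adv = torch.optim.Adam(adversary.parameters(), lr=1e-3)
ce = nn.CrossEntropyLoss()

x = torch.randn(32, 100)                      # dummy fused input
conf = torch.randint(0, 4, (32,))             # confounder class labels

for _ in range(2):
    # ping: train the adversary on a frozen latent space
    with torch.no_grad():
        z = encoder(x)
    opt_adv.zero_grad()
    ce(adversary(z), conf).backward()
    opt_adv.step()

    # pong: train the encoder to *increase* the adversary's loss
    # (the VAE reconstruction/KL terms are omitted for brevity)
    opt_enc.zero_grad()
    loss_enc = -ce(adversary(encoder(x)), conf)
    loss_enc.backward()
    opt_enc.step()
```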
- 1. Select basic model
- Simidjievski, Nikola, et al. "Variational autoencoders for cancer data integration: design principles and computational practice." Frontiers in Genetics 10 (2019): 1205.
- https://github.com/CancerAI-CL/IntegrativeVAEs
- The X-shaped Variational Autoencoder (X-VAE) architecture is recommended overall in this comparative study
- 2. Reform the basic model
- Implement in PyTorch Lightning
- Rearrange code
- Provide two latent loss functions (KL divergence and Maximum Mean Discrepancy)
- Implement testing metrics
- 3. Create a clustering model
- Strategy 1: Run K-means (or other clustering methods) on the latent space
- Strategy 2: Add a term in the loss function to iteratively optimize the clustering quality
- 4. Correct for confounders
- Strategy 1: Take confounders into account during decoding; the loss function is conditioned on the confounders. Adapted from: Lawry Aguila, Ana, et al. "Conditional VAEs for Confound Removal and Normative Modelling of Neurodegenerative Diseases." Medical Image Computing and Computer Assisted Intervention–MICCAI 2022: 25th International Conference, Singapore, September 18–22, 2022, Proceedings, Part I. Cham: Springer Nature Switzerland, 2022.
  - Build cVAE: concatenate the covariates to the input and latent dimensions
- Strategy 2: Add a term to the loss function to minimize the association/similarity between the latent embedding and the confounders
  - XVAE with adversarial training; inspiration from this paper
    - simple version; only pre-training
    - ping-pong training
  - XVAE with adversarial training; inspiration from this paper
- Strategy 3: Simply remove confounded latent features
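Strategy 3 can be sketched as a simple correlation filter: drop every latent dimension whose association with a confounder exceeds a threshold. The threshold and the use of Pearson correlation are illustrative choices, not the repo's settings:

```python
# Illustrative sketch of Strategy 3: drop latent dimensions whose absolute
# Pearson correlation with a confounder exceeds a (freely chosen) threshold.
import numpy as np

rng = np.random.default_rng(0)
conf = rng.normal(size=200)                   # one continuous confounder
z = rng.normal(size=(200, 50))                # latent embedding (latentSize = 50)
z[:, 0] = conf + 0.1 * rng.normal(size=200)   # make dim 0 clearly confounded

r = np.array([np.corrcoef(z[:, j], conf)[0, 1] for j in range(z.shape[1])])
keep = np.abs(r) < 0.3                        # threshold is a free choice
z_clean = z[:, keep]                          # dim 0 is dropped
```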