Scripts for multi-omics integration
This script performs unsupervised analyses (clustering) of transformed expression data (e.g., log FPKM) and methylation beta values.
This R script requires the following packages:
- iClusterPlus
- gplots
- lattice
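
One possible way to install these dependencies from an R session is sketched below. It assumes that gplots and lattice are taken from CRAN and iClusterPlus from Bioconductor (via BiocManager); adapt it if your environment manages packages differently, for example through conda.

```r
# CRAN packages used for plotting.
install.packages(c("gplots", "lattice"))

# iClusterPlus is distributed through Bioconductor; BiocManager is installed
# first if it is not already available.
if (!requireNamespace("BiocManager", quietly = TRUE)) {
  install.packages("BiocManager")
}
BiocManager::install("iClusterPlus")
```
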

`Rscript integration_unsupervised.R [options]`

| PARAMETER | DEFAULT | DESCRIPTION |
|---|---|---|
| -d | NULL | File with somatic mutation data |
| -C | NULL | File with copy number variation data |
| -r | NULL | File with expression data |
| -m | NULL | File with methylation data (beta values) |
| -k | 2 | Minimum number of clusters |
| -K | 6 | Maximum number of clusters |
| -c | 2 | Number of cores |
| -o | out | Output prefix |
| -h | | Show help message and exit |

For example, one can type

`Rscript integration_unsupervised.R -r expression_matrix.txt -o output/`

The script involves the following steps:
- Data transformation of methylation beta values, using the logit function
- Clustering across a range of LASSO lambda penalties and for each number of clusters K using iClusterPlus
- Selection of the best lambda value (by BIC) for each K, and a plot of R^2 as a function of K to help choose K
- Selection of the top features differentiating the clusters
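
As an illustration of the first step, the following is a minimal sketch of a logit transformation of beta values in base R. It is not the script's actual code: the function name `logit_beta`, the clipping bound `eps`, the use of the natural logarithm, and the toy matrix are all assumptions made for the example.

```r
# Logit transform of methylation beta values: log(beta / (1 - beta)).
# `eps` is an assumed clipping bound (not taken from the script) that keeps
# beta values of exactly 0 or 1 from producing -Inf or Inf.
logit_beta <- function(beta, eps = 1e-6) {
  beta <- pmin(pmax(beta, eps), 1 - eps)  # keep values strictly inside (0, 1)
  log(beta / (1 - beta))
}

# Hypothetical toy matrix: rows are samples, columns are CpG sites.
beta <- matrix(c(0.10, 0.50, 0.90,
                 0.00, 1.00, 0.25),
               nrow = 2, byrow = TRUE,
               dimnames = list(c("s1", "s2"), c("cg1", "cg2", "cg3")))

m <- logit_beta(beta)
print(round(m, 3))
```

The transformation maps beta values from the bounded interval (0, 1) onto the whole real line, which suits methods that model the data as Gaussian better than the raw beta values do.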

The script produces the following outputs:
- A figure with R^2 as a function of K, and the cluster membership of each sample as a function of K
In addition, for each value of K:
- an .RData file with clustering results
- a heatmap with the top features for each dataset
- a .txt file with the name of the top features for each dataset