Analysis of the saliva methylome of IVF children

saliva_methylome

In this project we compared the saliva methylomes of 9-year-olds who were born after in vitro fertilisation (IVF) using different IVF culture media, namely G3 (Vitrolife) and K-SICM (Cook). Data were generated using Illumina's Infinium MethylationEPIC BeadChip, which profiles around 850,000 CpG sites across the human genome. Furthermore, data from the FLEHS birth cohort were included to compare the methylomes of IVF and naturally conceived neonates.

The manifest_filter directory contains the manifest files used for the targeted analyses, and the scripts directory contains the scripts used for data processing, analysis and visualisation. A brief description of each script is given below:

Data preprocessing:

  1. Data were first preprocessed with the RnBeads package, which applied the following steps:
    • Normalisation using the SWAN method
    • Removal of poor-quality sites and samples using Greedycut (detection p-value threshold 0.05)
    • Removal of probes on the sex chromosomes
    • Removal of probes containing SNPs
    • Removal of probes not in the CpG context
    • Removal of probes with missing values in 5% or more of the samples
  2. Sex estimation using the method from the sEst package
  3. Cell composition estimation using the IDOL-optimized probes from the ewastools package
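
The last filtering step in the list above (dropping probes with missing values in 5% or more of the samples) can be illustrated in base R. This is a minimal sketch on simulated data, not the RnBeads implementation used in the scripts; the matrix `betas` and the probe and sample names are hypothetical.

```r
# Simulated beta-value matrix (assumption): rows are probes, columns are samples.
set.seed(1)
n_probes <- 10
n_samples <- 40
betas <- matrix(
  runif(n_probes * n_samples),
  nrow = n_probes,
  dimnames = list(paste0("cg", seq_len(n_probes)), paste0("s", seq_len(n_samples)))
)

# Introduce missing values for two hypothetical probes.
betas["cg3", 1] <- NA      # missing in 1 of 40 samples (2.5%)
betas["cg7", 1:4] <- NA    # missing in 4 of 40 samples (10%)

# Fraction of samples with a missing value, per probe.
missing_fraction <- rowMeans(is.na(betas))

# Keep only probes with missing values in fewer than 5% of the samples.
keep <- missing_fraction < 0.05
betas_filtered <- betas[keep, , drop = FALSE]
```

In a real run the beta-value matrix would come from the preprocessed data set rather than from `runif`, and the filter would normally be applied by the preprocessing package itself.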

Analysis & visualisation:

  1. Principal component analysis & determination of associations between PCs and sample characteristics
  2. Differential methylation analysis at the site level using eBayes-moderated mixed-effects linear models as implemented in the variancePartition package
  3. Differential methylation analysis at the regional level using eBayes-moderated mixed-effects linear models as implemented in the variancePartition package. Regions of interest: genes, promoters and CpG islands
  4. Visualisation of the differential methylation analyses as volcano plots
  5. Determination of outliers using a threshold-based approach
  6. Application of the iEVORA algorithm to identify differentially variable CpG sites
  7. Combination of the plots produced by the scripts above into figures for publication
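
The first analysis step (PCA followed by testing principal components against sample characteristics) can be sketched in base R as follows. This is an illustration on simulated data under stated assumptions, not the code in the scripts directory: the annotation table `pheno`, its columns, the simulated sex effect and the choice of the Kruskal-Wallis test are all hypothetical.

```r
set.seed(42)
n_samples <- 30
n_probes <- 500

# Hypothetical sample annotation with two categorical characteristics.
pheno <- data.frame(
  medium = factor(rep(c("G3", "K-SICM"), each = n_samples / 2)),
  sex = factor(rep(c("F", "M"), times = n_samples / 2))
)

# Simulated methylation matrix (probes x samples) with a sex effect
# added to the first 50 probes so that one PC carries a known signal.
betas <- matrix(rnorm(n_probes * n_samples, mean = 0.5, sd = 0.1), nrow = n_probes)
is_male <- pheno$sex == "M"
betas[1:50, is_male] <- betas[1:50, is_male] + 0.3

# PCA on samples: prcomp expects observations (samples) in rows.
pca <- prcomp(t(betas), center = TRUE, scale. = FALSE)
pcs <- pca$x[, 1:5]

# Test each of the first five PCs against each sample characteristic.
# The result is a matrix of p-values: rows are PCs, columns are characteristics.
assoc_p <- sapply(pheno, function(trait) {
  apply(pcs, 2, function(pc) kruskal.test(pc, trait)$p.value)
})

# Proportion of variance explained by each of the first five PCs.
var_explained <- (pca$sdev^2 / sum(pca$sdev^2))[1:5]
```

`assoc_p` can then be inspected or plotted as a heatmap to see which characteristics track the main axes of variation. Continuous characteristics would need a different test, for example a correlation test.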