EVIAN is a suite of tools to perform a differential analysis of the distribution of CpG methylation (via Illumina Methyl Array technology).
This toolbox currently include:
- DNAm_Pct_report: a Rmd script to generate a HTML report about the distribution of CpG methylation in a few samples against a control population. This analysis is mainly focused on a set of known regions provided by the user.
- EVIAN_shiny.R: and a shiny interface to launch the previous scripts without using directly command-lines.
Each script can be directly launched in command-lines: look at the ReadMe.md from the corresponding directories for more information.
-
Install conda via Miniconda (https://docs.conda.io/en/latest/miniconda.html) or Anaconda.
-
Open your terminal window (Mac, Linux) or anaconda prompt (Windows) and launch the following command to install a new conda environment for the EVIAN toolbox:
conda create -n EVIAN_env -c conda-forge -c r -c bioconda r r-essentials r-rmarkdown pandoc r-kableextra bioconductor-rtracklayer r-rcolorbrewer r-reshape2 r-shiny r-shinyfiles r-shinywidgets r-waiter r-optparse r-shinydashboard bioconductor-minfi bioconductor-illuminahumanmethylationepicmanifest bioconductor-illuminahumanmethylation450kmanifest r-biocmanager
- Once the installation is complete, activate the environment previously generated to load the required dependencies before using any script from the toolbox:
conda activate EVIAN_env
The usage of each tool is detailed in the ReadMe.md of each corresponding directory.
Here is detailed the steps to use the shiny interface EVIAN_shiny.R.
- Once you loaded the required dependencies to use the scripts, launch the EVIAN_shiny.R script to open the interface.
Rscript ./path/to/EVIAN_shiny.R
- Each tab on the left corresponds to one of the available tool: select the one that you want to use.
- Follow the required instructions, select the files and conditions you want to use for the analysis.
- Once you finished completing the conditions you wanted to use, click on the "Generate report" or "Launch analysis" button at the bottom of the page.
- A command-line will be generated and launched: the progress can be followed on the terminal window.
Regions to be imported and analyzed must be saved in a tsv (tabulation-separated values) file.
This file must have the 7 following columns (using this following order):
column name | description | example |
---|---|---|
chr | chromosome | chr1 |
start | start position | 9200 |
end | end position | 10200 |
strand | strand | +, - or * |
name | name of the region | DmrA-geneA, DmrB-1234, etc... |
group | group(s) associated to this region | SyndromeA, ImprintedRegionGroup1, etc... |
status | status associated to the corresponding group in this region | "High_Confidence", "Hypothetical", etc... |
It is possible to import different region files at once.
It is also possible to have multiple groups associated to one region:
- 2 or more times the same region with one group per line with its associated status (ex: region 3)
- 2 or more groups (separated by a comma) in the "group" column of the same line (with associated status for each of the groups in the same order and separated by commas) (ex: region 2);
- or a combination of both previous possibilities (ex: region 3).
chr | start | end | strand | name | group | status |
---|---|---|---|---|---|---|
chr1 | 1000 | 1500 | * | region1 | groupA | HighConf |
chr3 | 500 | 750 | + | region2 | groupA,groupB | HighConf,HighConf |
chr4 | 5000 | 10000 | * | region3 | groupB | LowConf |
chr4 | 5000 | 10000 | * | region3 | groupC,groupD | HighConf,LowConf |
...
...