Note that the meta analysis, clumped and conditional analysis results from section 16 have now been uploaded to the SFTP server. They are stored in /srv/sftponly/GoDMC/shared/16.tar. To incorporate them into your repository:
- Update the repository with
  git pull
  and navigate to the results/ folder
- Delete the existing 16/ folder, or move it aside with
  mv 16/ 16_old
- SFTP into the server
  sftp username@filetrn-scmv-d0.epi.bris.ac.uk
- Download the results
  get shared/16.tar
- Untar
  tar xvf 16.tar
- If you type
  git status
  it should not show any staged changes.

Main files in here are 16_clumped.rdata, 16_conditional.rdata and the meta analysis results 16_*.txt.gz
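As a quick sanity check after untarring, something like the following could confirm that the expected files are present. This is only a sketch: check_results_16 is a hypothetical helper, not part of the repository, and it assumes the files sit directly inside the 16/ folder.

```shell
# Sketch: confirm the unpacked 16/ folder holds the main files listed above.
# check_results_16 is a hypothetical helper; pass it the path to the 16/ folder.
check_results_16() {
    dir=$1
    missing=0
    for f in 16_clumped.rdata 16_conditional.rdata; do
        if [ ! -f "$dir/$f" ]; then
            echo "missing: $f"
            missing=1
        fi
    done
    # At least one meta analysis result file matching 16_*.txt.gz should exist.
    set -- "$dir"/16_*.txt.gz
    if [ ! -f "$1" ]; then
        echo "missing: 16_*.txt.gz"
        missing=1
    fi
    [ "$missing" -eq 0 ] && echo "ok"
    return "$missing"
}

# Example (run from the results/ folder):
#   check_results_16 16
```

The function prints one line per missing item and returns non-zero if anything is absent, so it can also be used in a script condition.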
This performs the meta analysis of the putative mQTLs from section 16 of the pipeline.
Each cohort_16.tgz file contains 962 results.gz files, which are in GWAMA format. All associations in each sub file are the same across cohorts, so to meta analyse we just need to extract the files to the right location, create the METAL script, and run.
It is currently set up such that the cohort_16.tgz files need to be put into a directory somewhere and then gunzipped. Note: gunzipped, not untarred, e.g.
gunzip cohort_16.tgz
to produce cohort_16.tar. The reason for this is that the script extracts files from that tar file as they are needed, which is faster than doing it from a tgz file and more space efficient than extracting everything.
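To illustrate the idea, here is a minimal sketch of gunzipping an archive and then pulling a single member out of the resulting tar on demand. The helper extract_member, the member names and the throwaway demo archive are all made up for illustration; the actual script may extract files differently.

```shell
# Sketch: pull one result file out of an uncompressed cohort tar on demand.
# extract_member is a hypothetical helper, not part of the pipeline.
extract_member() {
    tarfile=$1   # e.g. cohort_16.tar (already gunzipped)
    member=$2    # path of the member inside the archive
    outdir=$3    # where to place the extracted file
    mkdir -p "$outdir" && tar -xf "$tarfile" -C "$outdir" "$member"
}

# Demo with a throwaway archive standing in for a real cohort upload.
demo=$(mktemp -d)
mkdir -p "$demo/src/results"
echo "chunk 1" | gzip > "$demo/src/results/res.1.gz"
echo "chunk 2" | gzip > "$demo/src/results/res.2.gz"
tar -czf "$demo/cohort_16.tgz" -C "$demo/src" results

gunzip "$demo/cohort_16.tgz"   # leaves cohort_16.tar, members still packed
extract_member "$demo/cohort_16.tar" results/res.2.gz "$demo/work"
gzip -dc "$demo/work/results/res.2.gz"   # prints: chunk 2
```

Only the requested member is written to disk; the other members stay inside the tar file.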
Once you have a directory with all the cohort_16.tar files, run
cd 01_meta_analysis_16
./run_metal.sh <batch number>
or
qsub run_metal.sh
Explanation of the output:
# Fixed effects
Effect
StdErr
pval
# Heterogeneity stats
Direction
HetISq
HetChiSq # This is Q statistic
HetDf
HetPVal
tausq
# Additive random effects (DerSimonian-Laird estimator)
EffectARE
StdErrARE
PvalueARE
# Multiplicative random effects
# The Effect size for this is the same as the Fixed effects
StdErrMRE
PvalueMRE
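To make the columns concrete, the sketch below recomputes several of them for a single association using the standard inverse-variance fixed effects, Cochran's Q, I-squared and DerSimonian-Laird formulas. It is an illustration under assumptions, not the pipeline's code: meta_summary is a hypothetical helper, the input is made-up "beta se" pairs (one cohort per line), HetISq is assumed to be reported in percent, and the p-value and multiplicative random effects columns are left out.

```shell
# Sketch: recompute summary columns for one association from per-cohort
# estimates. meta_summary is a hypothetical helper; stdin is "beta se" per line.
meta_summary() {
    awk '
    { b[NR] = $1; w[NR] = 1 / ($2 * $2) }          # inverse-variance weights
    END {
        if (NR == 0) exit 1
        for (i = 1; i <= NR; i++) { sw += w[i]; swb += w[i] * b[i]; sw2 += w[i] * w[i] }
        effect = swb / sw                          # Effect
        for (i = 1; i <= NR; i++) { d = b[i] - effect; q += w[i] * d * d }   # HetChiSq (Q)
        df = NR - 1                                # HetDf
        isq = (q > df) ? 100 * (q - df) / q : 0    # HetISq, assumed in percent
        c = sw - sw2 / sw
        tausq = (c > 0 && q > df) ? (q - df) / c : 0   # DerSimonian-Laird tausq
        for (i = 1; i <= NR; i++) { wr = 1 / (1 / w[i] + tausq); swr += wr; swrb += wr * b[i] }
        printf "Effect=%.4f StdErr=%.4f HetChiSq=%.4f HetDf=%d HetISq=%.2f tausq=%.4f EffectARE=%.4f StdErrARE=%.4f\n",
            effect, sqrt(1 / sw), q, df, isq, tausq, swrb / swr, sqrt(1 / swr)
    }'
}

# Three made-up cohorts with effects 0.10, 0.20, 0.30 and standard error 0.05.
printf '0.10 0.05\n0.20 0.05\n0.30 0.05\n' | meta_summary
# prints: Effect=0.2000 StdErr=0.0289 HetChiSq=8.0000 HetDf=2 HetISq=75.00 tausq=0.0075 EffectARE=0.2000 StdErrARE=0.0577
```

In this example the cohorts disagree, so the additive random effects standard error (0.0577) is twice the fixed effects one (0.0289), while the effect estimate stays at 0.2 because all three cohorts carry equal weight.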
Same as in 01, except the files sent by the cohorts are much larger (e.g. 20-30 GB each). Each sub result file is in a specific binary format that is very space efficient, so the script extracts those files and runs an R script to convert each binary file into GWAMA format. From there the process continues as described for 01.
Again, you need to gunzip the cohort_17.tgz files to create cohort_17.tar files, and then run
cd 02_meta_analysis_17
./run_metal.sh <batch number>
or
qsub run_metal.sh
This performs clumping on the results of 01, using a p-value threshold of 1e-4 for cis and 5e-8 for trans associations. A radius of 1 Mb from the CpG is used to denote cis/trans. The European samples in the 1000 Genomes data are used as the LD reference panel.
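The cis/trans rule and the two thresholds can be sketched as a simple filter. This is an assumption-laden illustration rather than the clumping script itself: filter_cis_trans is a hypothetical helper, the five-column input layout (snp_chr snp_pos cpg_chr cpg_pos pval) is invented for the example, and treating a distance of exactly 1 Mb as cis and using strict "less than" for the p-value are both guesses.

```shell
# Sketch: label each SNP-CpG pair cis or trans and apply the matching threshold.
# filter_cis_trans is a hypothetical helper. The input layout
# (snp_chr snp_pos cpg_chr cpg_pos pval) is assumed for illustration only.
filter_cis_trans() {
    awk -v radius=1000000 -v cis_p=1e-4 -v trans_p=5e-8 '
    {
        dist = $2 - $4
        if (dist < 0) dist = -dist
        # cis: same chromosome and within the radius of the CpG; otherwise trans
        kind = ($1 == $3 && dist <= radius) ? "cis" : "trans"
        thr = (kind == "cis") ? cis_p : trans_p
        if ($5 + 0 < thr + 0) print $0, kind
    }'
}

# Made-up pairs: only the first and third rows pass their thresholds.
printf '%s\n' \
    '1 1000000 1 1500000 3e-5' \
    '1 1000000 1 2500000 3e-5' \
    '1 1000000 2 1000000 1e-9' \
    '1 1000000 1 1200000 1e-3' | filter_cis_trans
```

The second row shows why the distinction matters: at 1.5 Mb from the CpG it counts as trans, so a p-value of 3e-5 that would pass the cis threshold is dropped.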
To run:
cd 03_clumping_16
./run_clumping.sh <batch number>
or
qsub run_clumping.sh
Once it is finished, run
Rscript aggregate_clumps.r
to create the file results/16/16_clumped.rdata, which contains all the clumped results.
To run this on bluecrystal 4 use
sbatch run_clumped_bc4.sh
Note: Get annotations for 450k CpGs
library(IlluminaHumanMethylation450kanno.ilmn12.hg19)
data(IlluminaHumanMethylation450kanno.ilmn12.hg19)
anno <- IlluminaHumanMethylation450kanno.ilmn12.hg19@data$Other
This performs conditional analysis on the results of 01, with the same p-value thresholds as in 03. It uses a larger reference sample (4000 samples in ALSPAC; see make_alspac_reference_dataset.sh for details of how to make it).
To run on bc4
cd 04_conditional_16
./run_conditional_bc4.sh <batch number>
or
sbatch run_conditional_bc4.sh
See documentation within each of the subsequent sections for details.