A MeRIP-seq analysis pipeline that combines multiple alignment tools, peak-calling tools, peak-merging methods, and methylation analysis methods.
Here, we present MeRIPseqPipe, an integrated analysis pipeline for MeRIP-seq data based on Nextflow. It integrates ten main functional modules: data preprocessing, quality control, read mapping, peak calling, peak merging, motif searching, peak annotation, differential methylation analysis, differential expression analysis, and data visualization. Together these modules cover the basic analysis of MeRIP-seq data.
All analysis modules are implemented in Nextflow, and all third-party tools are packaged in a Docker container.
- Install Nextflow.
- Pull the Docker image from Docker Hub: kingzhuky/meripseqpipe:dev
- Clone this repository and check that the pipeline starts:
    git clone https://github.com/canceromics/MeRIPseqPipe.git
    nextflow run /path/to/MeRIPseqPipe --help
- Test it on a minimal dataset with a single command:
    nextflow run path/to/meripseqpipe -profile test,docker
- Start running your own analysis:
    nextflow run path/to/meripseqpipe -profile docker \
      --designfile designfile.tsv --comparefile compare.txt -resume \
      --aligners star --fasta hg38_genome.fa --gtf gencode.v25.annotation.gtf \
      --rRNA_fasta hg38_rRNA.fasta --outdir path/to/results \
      --skip_createbedgraph --peakMerged_mode rank --star_index hg38/starindex \
      --skip_meyer --skip_matk --methylation_analysis_mode Wilcox-test
See usage docs for more details and all of the available options when running the pipeline.
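When the pipeline is launched from a script rather than typed by hand, it can help to assemble the command from a parameter table. The following is a minimal sketch in Python, not part of MeRIPseqPipe itself: the helper name `build_nextflow_command` is hypothetical, and the only pipeline options it uses are the ones shown in the example command above.

```python
import shlex


def build_nextflow_command(pipeline_path, profile, params, flags=(), resume=False):
    """Return the argument list for a `nextflow run` call (hypothetical helper)."""
    cmd = ["nextflow", "run", pipeline_path, "-profile", profile]
    if resume:
        cmd.append("-resume")  # Nextflow option: single dash
    for name, value in params.items():
        cmd += [f"--{name}", str(value)]  # pipeline parameter: double dash
    for name in flags:
        cmd.append(f"--{name}")  # boolean pipeline switches take no value
    return cmd


# Parameter values copied from the example command above.
params = {
    "designfile": "designfile.tsv",
    "comparefile": "compare.txt",
    "aligners": "star",
    "fasta": "hg38_genome.fa",
    "gtf": "gencode.v25.annotation.gtf",
    "rRNA_fasta": "hg38_rRNA.fasta",
    "outdir": "path/to/results",
    "peakMerged_mode": "rank",
    "star_index": "hg38/starindex",
    "methylation_analysis_mode": "Wilcox-test",
}
flags = ["skip_createbedgraph", "skip_meyer", "skip_matk"]

cmd = build_nextflow_command(
    "path/to/meripseqpipe", "docker", params, flags=flags, resume=True
)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` once Nextflow and Docker are installed. Keeping the parameters in a dictionary makes it easier to record exactly which settings produced a given result directory.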
The MeRIPseqPipe documentation is split into the following files:
- Usage
  - Parameter Documentation
  - An overview of how the pipeline works, how to run it, and a description of all of the different command-line flags.
  - Let us know if you need more customization!
- Output
  - An overview of the different results produced by the pipeline
This pipeline is built using Nextflow and integrates tools as follows:
- Quality control and preprocessing of raw data
- Read alignment
- Peak calling
- Peak merging
  - RobustRankAggreg: a rank aggregation algorithm
  - MSPC: uses combined evidence from replicates to evaluate ChIP-seq peaks
  - BEDTools: uses the "mergeBed" and "intersectBed" functions
- Peak annotation
  - Perl scripts: report peak start/end position, gene start/end position, transcript ID, strand, gene type (coding or noncoding, lncRNA or mRNA, etc.), peak location, gene Ensembl ID, etc.
  - annotatePeaks.pl: reports whether a peak is in the TSS (transcription start site), TTS (transcription termination site), Exon (Coding), 5' UTR Exon, 3' UTR Exon, Intronic, or Intergenic region, and also shows the distance to the TSS
- Motif searching
  - HOMER: Hypergeometric Optimization of Motif EnRichment
- m6A site prediction
  - MATK: predicts m6A sites at single-nucleotide resolution
- Differential expression analysis
  - featureCounts: read counting relative to gene biotype
  - DESeq2: for differential expression analysis of RNA-Seq, SAGE-Seq, ChIP-Seq or HiC count data
  - edgeR: for differential expression analysis of RNA-Seq, SAGE-Seq, ChIP-Seq or HiC count data
- Differential methylation analysis
  - QNB: a statistical approach for differential RNA methylation analysis with count-based small-sample sequencing data
  - MATK: uses a Bayesian hierarchical model to eliminate the effect of basal expression and quantify the true m6A level by Markov Chain Monte Carlo sampling
  - Wilcox-test: results are generated by custom R scripts based on RPKM values
  - DESeq2: uses a generalized linear model to detect changes in IP coverage while controlling for differences in Input coverage
  - edgeR: uses a generalized linear model to detect changes in IP coverage while controlling for differences in Input coverage
- Report
  - MultiQC: summarizes all results from quality control and alignment
  - R packages
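To make the idea behind the Wilcox-test mode listed above more concrete, here is a minimal Python sketch of a rank-sum comparison on RPKM-normalized values. It is an illustration under assumptions, not the pipeline's own R code: the function names are hypothetical, the example numbers are made up, and the exact statistic the custom scripts compute may differ.

```python
from itertools import combinations


def rpkm(read_count, region_length_bp, total_mapped_reads):
    """Reads per kilobase of region per million mapped reads."""
    return read_count * 1e9 / (region_length_bp * total_mapped_reads)


def midranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_sum_exact_p(group_a, group_b):
    """Two-sided exact Wilcoxon rank-sum p-value.

    Enumerates every way of assigning the pooled ranks to group A, so it is
    only practical for the small replicate numbers typical of MeRIP-seq.
    """
    ranks = midranks(list(group_a) + list(group_b))
    n_a, n = len(group_a), len(group_a) + len(group_b)
    expected = n_a * (n + 1) / 2
    observed = abs(sum(ranks[:n_a]) - expected)
    sums = [sum(ranks[i] for i in idx) for idx in combinations(range(n), n_a)]
    extreme = sum(1 for s in sums if abs(s - expected) >= observed - 1e-9)
    return extreme / len(sums)


# Made-up per-sample RPKM values for one peak in two conditions.
control = [rpkm(c, 200, 20_000_000) for c in (40, 52, 47)]
treated = [rpkm(c, 200, 20_000_000) for c in (95, 110, 88)]
print(rank_sum_exact_p(control, treated))
```

With three replicates per group the smallest two-sided p-value this test can return is 2/20, which is one reason the count-based models in the list (QNB, DESeq2, edgeR) are often preferred when replicates are few.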
MeRIPseqPipe was originally written by Xiaoqiong Bao and Kaiyu Zhu.
If you would like to contribute to this pipeline, please see the contributing guidelines.
MeRIPseqPipe has been registered with BioTreasury (https://biotreasury.rjmart.cn/#/tool?id=61140); you are welcome to use it and leave comments!
Xiaoqiong Bao, Kaiyu Zhu, Xuefei Liu, Zhihang Chen, Ziwei Luo, Qi Zhao, Jian Ren, Zhixiang Zuo. MeRIPseqPipe: an integrated analysis pipeline for MeRIP-seq data based on Nextflow. Bioinformatics, 2022, btac025. https://doi.org/10.1093/bioinformatics/btac025
Thanks to nf-core for the support and guidance!
You can cite the nf-core publication as follows:
The nf-core framework for community-curated bioinformatics pipelines.
Philip Ewels, Alexander Peltzer, Sven Fillinger, Harshil Patel, Johannes Alneberg, Andreas Wilm, Maxime Ulysse Garcia, Paolo Di Tommaso & Sven Nahnsen. Nat Biotechnol. 2020 Feb 13. doi: 10.1038/s41587-020-0439-x.