A Nextflow pipeline to align and quantify methylation (bisulfite) sequencing data.
The pipeline was created to run on the ETH Euler cluster and relies on the server's genome files, so it must be adapted before it can run on a different HPC cluster.
- FastQC
- FastQ Screen
- Trim Galore
- FastQC
- Bismark
- Bismark filter non-conversion [Optional]
- Bismark deduplication
- Bismark methylation extractor
- coverage2cytosine [Optional]
- Bismark2report
- Bismark2summary
- MultiQC
Path to the folder where the FASTQ files are located.
--input /cluster/work/nme/data/josousa/project/fastq/*fastq.gz
Output directory where the files will be saved.
--outdir /cluster/work/nme/data/josousa/project
- Reference genome used for alignment.
--genome
Available genomes:
GRCm39 # Default
GRCm38
GRCh38
GRCh37
panTro6
CHIMP2.1.4
BDGP6
susScr11
Rnor_6.0
R64-1-1
TAIR10
WBcel235
E_coli_K_12_DH10B
E_coli_K_12_MG1655
Vectors
Lambda
PhiX
Mitochondria
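As a minimal sketch of how the options above combine, the snippet below assembles a command line from the example paths and the default genome, then prints it rather than running it. The script name `main.nf` is a hypothetical placeholder (the pipeline's real entry point is not named here), and the input pattern is held in a quoted variable only so that the dry run prints it verbatim.

```shell
# Hypothetical entry-point script; replace with the pipeline's actual main script.
PIPELINE='main.nf'

# Example values taken from the option descriptions above.
INPUT='/cluster/work/nme/data/josousa/project/fastq/*fastq.gz'
OUTDIR='/cluster/work/nme/data/josousa/project'
GENOME='GRCm39'   # the default genome

# Assemble the command as text and print it for review (dry run).
CMD="nextflow run $PIPELINE --input $INPUT --outdir $OUTDIR --genome $GENOME"
echo "$CMD"
```

Once the printed command looks right, it can be pasted into the shell or into a job script.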
- Option to use a custom genome for alignment by providing an absolute path to a custom genome file.
--custom_genome_file '/cluster/work/nme/data/josousa/project/genome/CHM13.genome'
Example of a genome file:
name GRCm39
species Mouse
bismark /cluster/work/nme/genomes/Mus_musculus/Ensembl/GRCm39/Sequence/BismarkIndex/
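The following sketch writes a custom genome file with the three fields shown in the example (`name`, `species`, `bismark`) and checks that each one is present. It assumes one key/value pair per line, separated by a tab; the separator, the temporary file location, and the index path are illustrative assumptions, not values taken from the pipeline.

```shell
# Hypothetical location for the new genome file (an absolute path).
GENOME_FILE="${TMPDIR:-/tmp}/CHM13.genome"

# Write one "key<TAB>value" line per field. The index path is a placeholder.
printf '%s\t%s\n' \
    name    CHM13 \
    species Human \
    bismark /path/to/CHM13/BismarkIndex/ \
    > "$GENOME_FILE"

# Check that every expected field appears at the start of a line.
missing=0
for key in name species bismark; do
    if ! grep -q "^${key}[[:space:]]" "$GENOME_FILE"; then
        echo "missing field: $key" >&2
        missing=1
    fi
done
[ "$missing" -eq 0 ] && echo "genome file looks complete: $GENOME_FILE"
```

The resulting path would then be passed to the pipeline with `--custom_genome_file`.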
- Option to provide a custom FastQ Screen config file.
--fastq_screen_conf '/cluster/work/nme/software/config/fastq_screen.conf' # Default
- Option to set the alignment mode to local.
--local
In this mode, it is not required that the entire read aligns from one end to the other. Rather, some characters may be omitted (“soft-clipped”) from the ends in order to achieve the greatest possible alignment score.
- Option to write all reads that could not be aligned to a file in the output directory.
--unmapped
- Option to write to a file in the output directory all reads that fail to align uniquely, including reads that produce more than one valid alignment with the same lowest number of mismatches.
--ambiguous
- Option to skip FastQC, Trim Galore, and FastQ Screen, so that the pipeline starts directly with the Bismark alignment.
--skip_qc
- Option to skip FastQ Screen.
--skip_fastq_screen
- Option to skip Bismark deduplication.
--skip_deduplication
- Option to run the Bismark non-conversion filter before deduplication (unless deduplication is skipped) and before the Bismark methylation extractor.
--add_filter_non_conversion
- Option to add extra arguments to FastQC.
--fastqc_args
- Option to add extra arguments to FastQ Screen.
--fastq_screen_args
- Option to add extra arguments to Trim Galore.
--trim_galore_args
- Option to add extra arguments to Bismark.
--bismark_args
- Option to add extra arguments to Bismark filter non-conversion.
--filter_non_conversion_args
- Option to add extra arguments to Bismark deduplication.
--deduplicate_bismark_args
- Option to add extra arguments to Bismark methylation extractor.
--bismark_methylation_extractor_args
- Option to add extra arguments to Bismark coverage2cytosine.
--coverage2cytosine_args
- Option to add extra arguments to Bismark2summary.
--bismark2summary_args
- Option to add extra arguments to Bismark2report.
--bismark2report_args
- Option to add extra arguments to MultiQC.
--multiqc_args
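To illustrate the extra-argument options, the sketch below prints (without running) a command that passes `--clip_R1 10` to Trim Galore and `--nogroup` to FastQC, both of which are existing flags of those tools. The script name `main.nf` is a hypothetical placeholder, and the `--option="value"` quoting is an assumption about how the pipeline expects these strings; adjust it if the pipeline requires a different form.

```shell
# Hypothetical entry-point script; replace with the pipeline's actual main script.
PIPELINE='main.nf'

# Extra flags for the wrapped tools, each kept as a single string.
TRIM_GALORE_ARGS='--clip_R1 10'   # Trim Galore: clip 10 bp from the 5' end of read 1
FASTQC_ARGS='--nogroup'           # FastQC: report every base position separately

# Build the command text with the values wrapped in double quotes.
# Required options such as --input and --outdir are omitted for brevity.
CMD=$(printf 'nextflow run %s --trim_galore_args="%s" --fastqc_args="%s"' \
    "$PIPELINE" "$TRIM_GALORE_ARGS" "$FASTQC_ARGS")
echo "$CMD"
```

Keeping each tool's flags in one quoted string keeps multi-word values such as `--clip_R1 10` together as a single option value.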
This pipeline was adapted from the Nextflow pipelines created by the Babraham Institute Bioinformatics Group and from the nf-core pipelines. We thank all the contributors to both projects. We also thank the Nextflow and nf-core communities for all their help and support.