This is the ChIP-seq mapping pipeline for our lab. The script is still at a very preliminary stage; please let us know how to improve it!
- Install snakemake
- Go to your work directory, create a fastq folder, and copy or link your fastq files into it
- Run /run_chipseq_vanilla.sh -g hg19 -e [email@gmail.com]. It will process the fastq files of every sample in the fastq/ directory and send an e-mail when it is done.
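
The fastq setup step above can be scripted. The following is a minimal sketch, not part of the pipeline itself: the function name `link_fastqs`, the source directory argument, and the `*.fastq.gz` file pattern are all assumptions made for illustration, so adjust them to match how your files are actually named.

```python
from pathlib import Path


def link_fastqs(source_dir, work_dir, pattern="*.fastq.gz"):
    """Symlink files matching `pattern` from source_dir into work_dir/fastq.

    The default pattern is an assumption; the pipeline may expect other names.
    """
    fastq_dir = Path(work_dir) / "fastq"
    fastq_dir.mkdir(parents=True, exist_ok=True)  # create fastq/ if needed
    linked = []
    for src in sorted(Path(source_dir).glob(pattern)):
        dest = fastq_dir / src.name
        # Skip files that were already linked or copied on an earlier run.
        if not dest.is_symlink() and not dest.exists():
            dest.symlink_to(src.resolve())
        linked.append(dest)
    return linked
```

Calling `link_fastqs("/path/to/raw_data", ".")` from the work directory would populate fastq/ with links, where `/path/to/raw_data` is a placeholder for wherever your sequencing files live.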
All output paths are relative to your project directory.
- bam/[name].filt.nodup.srt.bam: the final bam file after filtering, duplicate removal, and sorting.
- bigWig/[name].filt.nodup.srt.bw: bigWig file generated from the final bam file.
- qc: summary statistics for the bam files.
  - *.flagstat.qc: flagstat summaries of the bam files at different processing steps.
  - [name].*.cc.qc: phantom peak (cross-correlation) summary.
  - [name].*.pbc.qc: PBC (PCR bottleneck coefficient) summary.
  - [name].*.plot.pdf: plot of the estimated tag shift size.
- logs/: run logs for the different parts of the pipeline.
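
To check that a run produced the final files listed above, something like the following sketch may help. The bam and bigWig path patterns come from the list above; the helper names `expected_outputs` and `missing_outputs` are hypothetical, and the qc and log files are left out because their exact names are not fully specified here.

```python
from pathlib import Path


def expected_outputs(name, project_dir="."):
    """Return the final bam and bigWig paths for one sample name."""
    root = Path(project_dir)
    return [
        root / "bam" / f"{name}.filt.nodup.srt.bam",
        root / "bigWig" / f"{name}.filt.nodup.srt.bw",
    ]


def missing_outputs(name, project_dir="."):
    """Return the expected output paths that do not exist yet."""
    return [p for p in expected_outputs(name, project_dir) if not p.is_file()]
```

For a sample called `sampleA` (a made-up name), `missing_outputs("sampleA")` returns an empty list once both final files are present in the project directory.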
- BWA
- samtools
- bedtools
- picard
- phantompeakqualtools
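
Before running the pipeline, it can be useful to confirm that the command-line tools are on your PATH. The sketch below is an illustration only: the executable names (`snakemake`, `bwa`, `samtools`, `bedtools`) are assumptions about a typical install, and picard and the phantom peak tools are left out because how they are launched depends on how they were installed.

```python
import shutil

# Assumed executable names for a typical install; adjust to your environment.
ASSUMED_EXECUTABLES = {
    "Snakemake": "snakemake",
    "BWA": "bwa",
    "samtools": "samtools",
    "bedtools": "bedtools",
}


def find_missing(executables=ASSUMED_EXECUTABLES):
    """Return the names of tools whose executable is not found on PATH."""
    return sorted(
        tool for tool, exe in executables.items() if shutil.which(exe) is None
    )


if __name__ == "__main__":
    missing = find_missing()
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("All checked tools were found on PATH.")
```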
In its complete form, this pipeline will include workflows for the analysis of (1) a single sample, and (2) replicates using the ENCODE IDR method.
- 2017.05.26: The Snakemake pipeline for a single sample is available.
- 2017.05.17: Currently we have only the workflow for a single sample, and it is not modularized yet.