ChIA-PET Analysis Software
Novel ChIA-PET analysis method reveals two major classes of chromatin interactions. Phanstiel DH, Boyle AP, Heidari N, Snyder MP. In preparation.
- Mango depends on the following R packages.
- hash
- Rcpp
- optparse
They can be installed through CRAN. For example, to install all three packages, open R and type the following:
install.packages('hash')
install.packages('Rcpp')
install.packages('optparse')
- Mango depends on the following software packages, which should be installed and included in the system PATH prior to using Mango.
- Bowtie (http://bowtie-bio.sourceforge.net)
- Bedtools (https://github.com/arq5x/bedtools2)
- MACS2 (https://github.com/taoliu/MACS)
- Once the dependencies are installed, Mango can be installed from the command line using the following commands.
git clone https://github.com/dphansti/mango.git
R CMD INSTALL --no-multiarch --with-keep.source mango
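The PATH requirement above can be checked before running Mango. The following is a minimal Python sketch and is not part of Mango; the executable names `bowtie`, `bedtools`, and `macs2` are assumptions about how the tools were installed, and `missing_tools` is a hypothetical helper.

```python
import shutil

# Assumed executable names; adjust them if your installation differs.
REQUIRED_TOOLS = ["bowtie", "bedtools", "macs2"]


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the tools that cannot be found on the system PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("All external dependencies were found on PATH.")
```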
Mango uses fastq files generated by Illumina sequencers to call peaks and interactions from ChIA-PET experiments. Arguments can be passed to Mango by a configuration file, through the command line, or a combination of both. In cases where arguments are supplied both through the command line and a configuration file, the values passed via command line arguments will take precedence.
Rscript mango.R [-options]
Example of regular interaction calling
Rscript mango.R --fastq1 samplename_1.fastq --fastq2 samplename_2.fastq --prefix samplename --argfile argfile.txt --chromexclude chrM,chrY --stages 1:5
Example of an argfile
bowtieref = /path/to/hg19
bedtoolsgenome = /path/to/human.hg19.genome
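The precedence rule described above (command line over configuration file) can be pictured with a short Python sketch. This is an illustration only, not Mango's own parsing code: `parse_argfile` and `merge_args` are hypothetical helpers, and the `prefix = from_argfile` line is added here purely to show an override.

```python
def parse_argfile(text):
    """Parse 'name = value' lines into a dict, skipping blank lines."""
    args = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue
        name, value = line.split("=", 1)
        args[name.strip()] = value.strip()
    return args


def merge_args(defaults, from_file, from_cli):
    """Later sources win: defaults < argfile < command line."""
    merged = dict(defaults)
    merged.update(from_file)
    merged.update(from_cli)
    return merged


argfile_text = """
bowtieref = /path/to/hg19
bedtoolsgenome = /path/to/human.hg19.genome
prefix = from_argfile
"""
defaults = {"prefix": "mango", "stages": "1:5"}
cli = {"prefix": "samplename"}  # as if --prefix samplename had been given

final = merge_args(defaults, parse_argfile(argfile_text), cli)
print(final["prefix"])  # the command-line value overrides the argfile value
```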
stages
- stages of the pipeline to execute. Can be either a single stage (e.g. 1) or a range of stages (e.g. 1:5). default = 1:5
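As an illustration of the two accepted forms, the sketch below expands a stages value into an explicit list of stage numbers. `parse_stages` is a hypothetical helper, and rejecting descending ranges is an assumption of this sketch rather than documented Mango behaviour.

```python
def parse_stages(value):
    """Turn '3' into [3] and '1:5' into [1, 2, 3, 4, 5]."""
    if ":" in value:
        first, last = (int(part) for part in value.split(":", 1))
    else:
        first = last = int(value)
    # Assumption of this sketch: ranges must run from low to high.
    if first > last:
        raise ValueError("stage range must be ascending: " + value)
    return list(range(first, last + 1))
```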
prefix
- prefix for all output files. default = mango
outdir
- The output directory. default = NULL
bowtieref
- genome reference file for bowtie
bedtoolsgenome
- bedtools genome file
chrominclude
- comma separated list of chromosomes to use (e.g. chr1,chr2,chr3,...). Only these chromosomes will be processed. If NULL, all chromosomes will be processed. default = NULL
chromexclude
- comma separated list of chromosomes to exclude (e.g. chrM,chrY). If NULL, no chromosomes will be excluded. default = NULL
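The two chromosome options can be read as an include filter followed by an exclude filter. The sketch below illustrates that reading; `select_chromosomes` is a hypothetical helper, and the order in which the two lists are applied when both are given is an assumption.

```python
def select_chromosomes(all_chroms, chrominclude=None, chromexclude=None):
    """Keep included chromosomes (all if None), then drop excluded ones."""
    include = set(chrominclude.split(",")) if chrominclude else None
    exclude = set(chromexclude.split(",")) if chromexclude else set()
    return [
        chrom
        for chrom in all_chroms
        if (include is None or chrom in include) and chrom not in exclude
    ]
```

For example, `select_chromosomes(chroms, chromexclude="chrM,chrY")` corresponds to the `--chromexclude chrM,chrY` option used in the example command.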
linkerA
- linker sequence to look for. default = GTTGGATAAG
linkerB
- linker sequence to look for. default = GTTGGAATGT
minlength
- min length of reads after linker trimming. default = 15
maxlength
- max length of reads after linker trimming. default = 25
keepempty
- Should reads with no linker be kept (TRUE or FALSE). default = FALSE
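The linker and length options can be pictured together with a small sketch. This is not Mango's implementation: it assumes exact linker matching, assumes the sequence upstream of the linker is the part that is kept, and assumes the length limits apply only to trimmed reads. `trim_read` is a hypothetical helper.

```python
def trim_read(read, linkers=("GTTGGATAAG", "GTTGGAATGT"),
              minlength=15, maxlength=25, keepempty=False):
    """Return the sequence upstream of a linker, or None if the read is dropped."""
    for linker in linkers:
        pos = read.find(linker)  # exact match only (assumption)
        if pos != -1:
            tag = read[:pos]
            # Keep the trimmed tag only if its length is within the limits.
            return tag if minlength <= len(tag) <= maxlength else None
    # No linker found: keep the untrimmed read only if keepempty is set.
    return read if keepempty else None
```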
shortreads
- should bowtie alignments be done using parameters for very short reads (~20 bp). default = TRUE
MACS_qvalue
- q-value cutoff for peak calling in MACS2. default = 0.05
MACS_shiftsize
- MACS shiftsize. NULL allows MACS to determine it
peakslop
- Number of base pairs to extend peaks on both sides. default = 500
peakinput
- Name of user supplied peaks file. If NULL Mango will use peaks determined from MACS2 analysis. default = NULL
blacklist
- BED file of regions to remove from MACS peaks
distcutrangemin
- When Mango determines the self-ligation cutoff this is the minimum distance it will consider. default = 1000
distcutrangemax
- When Mango determines the self-ligation cutoff this is the maximum distance it will consider. default = 100000
biascut
- Mango excludes very short distance PETs since they tend to arise from self-ligation of a single DNA fragment as opposed to inter-ligation of two interacting fragments. To determine this distance cutoff, Mango determines the fraction of PETs at each distance that come from self-ligation and sets the cutoff at the point where the fraction is less than or equal to biascut. default = 0.05
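The cutoff rule just described can be sketched as a scan over distances. The sketch assumes the self-ligation fraction at each distance has already been estimated (how Mango estimates it is not covered here) and that distances are sorted in ascending order; `self_ligation_cutoff` is a hypothetical helper.

```python
def self_ligation_cutoff(distances, self_fraction, biascut=0.05,
                         distcutrangemin=1000, distcutrangemax=100000):
    """Return the first in-range distance whose self-ligation fraction is <= biascut."""
    for distance, fraction in zip(distances, self_fraction):
        # Only distances inside the allowed search range are considered.
        if distance < distcutrangemin or distance > distcutrangemax:
            continue
        if fraction <= biascut:
            return distance
    return None  # no distance in the allowed range met the threshold


# Made-up example values, for illustration only.
distances = [500, 1000, 2000, 5000, 10000]
fractions = [0.90, 0.60, 0.20, 0.04, 0.01]
cutoff = self_ligation_cutoff(distances, fractions)
```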
FDR
- FDR cutoff for significant interactions. default = 0.01
numofbins
- number of bins to use for binomial p-value calculations. default = 50
corrMethod
- Method to use for multiple hypothesis testing correction. See (http://stat.ethz.ch/R-manual/R-devel/library/stats/html/p.adjust.html) for more details. default = BH
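For the default `BH` setting, the linked `p.adjust` documentation describes the Benjamini-Hochberg procedure. The following pure-Python sketch shows that adjustment for readers unfamiliar with it; `bh_adjust` is a hypothetical name, and Mango itself works in R rather than through this code.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    n = len(pvalues)
    # Visit p-values from largest to smallest.
    order = sorted(range(n), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * n
    running_min = 1.0
    for position, index in enumerate(order):
        rank = n - position  # 1-based rank in ascending order
        # Scale by n / rank and enforce monotonicity with a running minimum.
        running_min = min(running_min, pvalues[index] * n / rank)
        adjusted[index] = running_min
    return adjusted
```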
maxinteractingdist
- The maximum distance (in base pairs) considered for interaction. default = 1000000
extendreads
- how many bp to extend reads towards peak. default = 120
minPETS
- The minimum number of PETs required for an interaction (applied after FDR filtering). default = 2
reportallpairs
- Should all pairs be reported or just significant pairs (TRUE or FALSE). default = FALSE
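The final reporting options can be read as two successive filters. The sketch below follows the order stated for minPETS (applied after FDR filtering); the record keys `q` and `pets`, the use of `<=` for the FDR comparison, and the helper name `report_pairs` are assumptions made for illustration.

```python
def report_pairs(pairs, FDR=0.01, minPETS=2, reportallpairs=False):
    """Filter interaction records (dicts with hypothetical keys 'q' and 'pets')."""
    if reportallpairs:
        return list(pairs)
    significant = [p for p in pairs if p["q"] <= FDR]        # FDR filter first
    return [p for p in significant if p["pets"] >= minPETS]  # then PET count


# Made-up records, for illustration only.
pairs = [
    {"name": "a", "q": 0.001, "pets": 5},
    {"name": "b", "q": 0.001, "pets": 1},
    {"name": "c", "q": 0.2, "pets": 9},
]
kept = [p["name"] for p in report_pairs(pairs)]
```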