Q: What is recommended command line for ATAC-Seq peak calling with bams from Paired End data
tamuanand opened this issue · 1 comments
Use case
I have paired end ATACSeq data (hg38) and I would like to know the recommended command line for ATAC-Seq peak calling with bams from Paired End data
Describe the problem
There are too many threads, discussion topics etc and lack of a clearcut answer makes people take piecemeal command line arguments from different isuses, etc.
Hence my questions:
- What's the recommended CLI for my use case?
- Should read shifting be compulsorily done (+4/-5) or is it purely optional
Describe the solution you tried
Based on the above, I have tried this below and want to make sure with @EvanTarbell and @taoliu if I am not missing anything
- do not do any shifting before peak calling with the assumption that the macs2 callpeak defaults will take care of this
- generate $no_chrM_and_mapq_filtered_blacklist_removed_bam: start with bam after bwa-mem, remove chrM + filter for mapq < 5 + exclude blacklisted regions using data from here: https://mitra.stanford.edu/kundaje/akundaje/release/blacklists/hg38-human/
- Use this command line:
macs2 callpeak -f BAMPE -g hs --keep-dup 1 --min-length 100 --call-summits -q 0.05 --bdg -t $no_chrM_and_mapq_filtered_blacklist_removed_bam
Additional context
Based on reading many different sources, this seems to be the closest to a recommended answer - #145 (comment). However,
- https://twitter.com/XiChenUoM/status/1336658454866325506 suggests
convert paired BAM to simple BED file and use "-f BED --shift -100 --extsize 200"
and then a counter to it from @EvanTarbell here - https://twitter.com/epigeneticsnerd/status/1337081679589101576 suggests that
BAMPE and --call-summits
is enough with macs2
I know there are many recommendations like "switch to macs3, hmmratac, Genrich" - I would still like to know how to use macs2 callpeak
given what I describe above.
Thanks in advance.