Low unique alignment %
gevro opened this issue · 8 comments
Hi,
2 out of 23 samples, all prepared in the same batch, have a low unique alignment % (38% and 45%) relative to the other samples (~75% on average).
See attached bismark alignment reports for the 2 problematic samples (M3 and M5) relative to a good sample (M6).
Representative FastQC reports of the samples do not indicate any abnormality such as excessive adapter content or overrepresented sequences. I don't see any other reason why these two samples would have a low unique alignment %. Do you have a suggestion for how to troubleshoot this?
M3.fastqc.pdf
M4.fastqc.pdf
M5.fastqc.pdf
Thanks
Note: One possibility I will investigate is perhaps these two samples had a higher than expected spike-in genome %, accounting for the unmapped reads.
(As a general comment, if you run deduplicate_bismark and bismark_methylation_extractor afterwards, the reports produced by bismark2report are a lot richer. Even better, running MultiQC (https://multiqc.info/) will aggregate everything into a single report. Also, all HTML files produced by Bismark, FastQC or MultiQC should be shareable, and are much nicer to look at than .pdf.)
Now for the problem at hand, I agree that all QC profiles you shared look very similar, and they also look good. Some standard trimming should get rid of the unwanted adapter, so it is not obvious why the samples would behave very differently. I have compiled a few FAQs regarding low mapping efficiency here: https://felixkrueger.github.io/Bismark/faq/low_mapping/
Maybe they can set you on the right path?
Thank you. I had another idea from your FAQ website: these are NEB EM-seq libraries, and I forgot to set the max insert size to 1000. So perhaps those two samples have larger insert sizes and lost more reads because of that.
I see also now there is an nf-core pipeline for bismark with an em-seq preset. So I will just switch to that.
Good point. Here are some trimming recommendations for EM-seq (https://felixkrueger.github.io/Bismark/bismark/library_types/#em-seq-neb), and there is a preset for the nf-core/methylseq workflow, too (be sure to use the dev revision, as 2.6.0 is a little broken...)
Thanks. How do I use the dev version exactly?
On the command line it is -r dev (I believe)
Hi, It looks like the dev version is still broken, with at least two major bugs: nf-core/methylseq#406
Any suggestions?
Thanks!
Hi, it seems to be working now. I had to add this to the config: process.stageInMode = 'copy'
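For anyone landing here later, a minimal sketch of how that setting could be supplied, assuming it goes into a separate config file passed to Nextflow with -c. The file name custom.config, the docker profile, and the samplesheet, genome and output paths are placeholders, not taken from this thread; only -r dev, the EM-seq preset and the stageInMode line come from the discussion above.

```shell
# Write a small custom config holding the setting reported in this thread.
# "custom.config" is a placeholder file name.
cat > custom.config <<'EOF'
// Stage input files by copying them instead of the default symlinking.
process.stageInMode = 'copy'
EOF

# Assumed invocation, printed rather than executed so it can be reviewed first.
# Profile, samplesheet, FASTA and output directory are placeholders.
echo "nextflow run nf-core/methylseq -r dev -profile docker -c custom.config --input samplesheet.csv --fasta genome.fa --outdir results --em_seq"
```

Copying inputs uses more disk space than symlinking, so this is probably best treated as a workaround for the dev revision rather than a permanent setting.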