fastp not removing all Illumina universal adapter sequences as indicated by FastQC
luckyvivi opened this issue · 5 comments
Hi, I recently ran fastp on an Illumina dataset with the following command:
fastp -i SRR18278237.fastq.gz -o SRR18278237.fastp.gz -z 9 -l 15 -w 16 --dedup --dup_calc_accuracy 6 -x -3 --cut_mean_quality 20 -j SRR18278237.fastp.json -h SRR18278237.fastp.html
I expected this command to remove the Illumina universal adapter sequences from the reads. However, after running FastQC on the output file, I still see significant adapter content in the FastQC report, specifically towards the end of the reads (please see attached screenshot).
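One way to cross-check the FastQC plot independently is to count how many reads in the trimmed file still contain the adapter prefix. The sketch below is an illustration only: the function name `count_adapter_reads` is hypothetical, and the default 12-base sequence `AGATCGGAAGAG` is assumed to be the one FastQC reports as "Illumina Universal Adapter" (for Nextera, `CTGTCTCTTATA` is assumed). It only finds exact matches, so adapters with sequencing errors or partial adapters at the very end of a read are not counted.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print "<reads containing adapter> <total reads>"
# for a gzipped FASTQ file with 4 lines per record.
# Usage: count_adapter_reads <fastq.gz> [adapter_sequence]
count_adapter_reads() {
    local fastq_gz="$1"
    # Assumed default: the 12-base prefix FastQC labels "Illumina Universal Adapter".
    local adapter="${2:-AGATCGGAAGAG}"
    # Sequence lines are line 2 of every 4-line record; index() does an exact substring search.
    gzip -dc "$fastq_gz" | awk -v a="$adapter" '
        NR % 4 == 2 { total++; if (index($0, a) > 0) hits++ }
        END { printf "%d %d\n", hits, total }'
}
```

For example, `count_adapter_reads SRR18278237.fastp.gz` could be run on the fastp output and on the raw input to compare the two counts.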
Could you please help me understand the following:
- Is there a possibility that fastp might not remove some of the adapter sequences under certain conditions?
- Do I need to specify the adapter sequences explicitly using the -a option, even though these are standard Illumina universal adapters?
- Is there anything in my fastp command that might have prevented the adapter sequences from being adequately detected and trimmed?
I have attached the JSON and HTML reports from fastp for your reference. I would greatly appreciate any insights or suggestions you might have to resolve this issue.
Thank you for your assistance and for developing such a useful tool.
Best regards,
Xiaowen
I have a similar issue, but with Nextera adapters. fastp reports no adapter contamination, while FastQC reports Nextera adapter content of up to 10% by the read end. Even when I supply the Nextera FASTA file (the one provided with Trimmomatic), virtually no trimming happens.
For comparison, Trimmomatic run with
ILLUMINACLIP:"${ADAPTERS}":2:30:10 SLIDINGWINDOW:4:25 MINLEN:45
drops 7.25% of all reads.
This isn't a perfect comparison, since I think fastp's default minimum window quality is 20, not 25. Still, something seems off here. I'm using v0.23.2.