OpenGene/fastp

fastp not removing all Illumina universal adapter sequences as indicated by FastQC

luckyvivi opened this issue · 5 comments

Hi, I recently ran fastp on an Illumina dataset with the following command:
fastp -i SRR18278237.fastq.gz -o SRR18278237.fastp.gz -z 9 -l 15 -w 16 --dedup --dup_calc_accuracy 6 -x -3 --cut_mean_quality 20 -j SRR18278237.fastp.json -h SRR18278237.fastp.html

I expected this command to remove the Illumina universal adapter sequences from the reads. However, after running FastQC on the output, I still see significant adapter content in the FastQC report, specifically toward the ends of the reads (please see the attached screenshot).
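To cross-check the FastQC plot independently, the trimmed file can be scanned for the adapter directly. The following is a minimal sketch, not FastQC's own algorithm: it assumes the 13-base sequence that FastQC lists as "Illumina Universal Adapter" (AGATCGGAAGAGC), looks only for exact full-length matches, and uses hypothetical helper names (`adapter_start_positions`, `summarize`).

```python
import gzip
from collections import Counter

# Assumption: 13-base prefix that FastQC lists as "Illumina Universal Adapter".
ILLUMINA_UNIVERSAL = "AGATCGGAAGAGC"


def adapter_start_positions(fastq_path, adapter=ILLUMINA_UNIVERSAL, max_reads=100_000):
    """Scan up to max_reads reads; return (reads_scanned, Counter of the
    0-based position where the adapter first appears in each matching read)."""
    opener = gzip.open if str(fastq_path).endswith(".gz") else open
    positions = Counter()
    scanned = 0
    with opener(fastq_path, "rt") as handle:
        for line_no, line in enumerate(handle):
            if line_no % 4 != 1:  # the sequence is line 2 of each 4-line record
                continue
            scanned += 1
            hit = line.strip().upper().find(adapter)  # exact match only
            if hit != -1:
                positions[hit] += 1
            if scanned >= max_reads:
                break
    return scanned, positions


def summarize(scanned, positions):
    """Return (reads_with_adapter, percent_of_scanned_reads)."""
    with_adapter = sum(positions.values())
    pct = 100.0 * with_adapter / scanned if scanned else 0.0
    return with_adapter, pct


# Example use on the trimmed output named in the command above:
# scanned, positions = adapter_start_positions("SRR18278237.fastp.gz")
# print(summarize(scanned, positions))
# print(sorted(positions.items())[:10])  # earliest start positions and counts
```

Because only complete copies of the 13-mer are counted, partial adapters shorter than 13 bases at the very end of a read are missed, so the percentage is a lower bound rather than a replacement for the FastQC figure.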

Could you please help me understand the following:

  1. Is there a possibility that fastp might not remove some of the adapter sequences under certain conditions?
  2. Do I need to specify the adapter sequences explicitly using the -a option, even though these are standard Illumina universal adapters?
  3. Is there anything in my fastp command that might have prevented the adapter sequences from being adequately detected and trimmed?
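For question 2, one experiment is to rerun the same command with the adapter given explicitly through fastp's `-a` (`--adapter_sequence`) option instead of relying on auto-detection. The sketch below only assembles that command line. The adapter value is an assumption (the 13-base sequence FastQC reports as "Illumina Universal Adapter"), `fastp_cmd_with_adapter` is a hypothetical helper, and whether the explicit adapter actually resolves the problem is not established in this thread.

```python
import shlex

# Assumption: 13-base sequence FastQC reports as "Illumina Universal Adapter".
# Swap in the full adapter for your library kit if you know it.
ADAPTER = "AGATCGGAAGAGC"


def fastp_cmd_with_adapter(sample, adapter=ADAPTER):
    """Build the single-end fastp command from the post, plus an explicit -a."""
    return [
        "fastp",
        "-i", f"{sample}.fastq.gz",
        "-o", f"{sample}.fastp.gz",
        "-a", adapter,                      # explicit adapter, the only change
        "-z", "9", "-l", "15", "-w", "16",
        "--dedup", "--dup_calc_accuracy", "6",
        "-x", "-3", "--cut_mean_quality", "20",
        "-j", f"{sample}.fastp.json",
        "-h", f"{sample}.fastp.html",
    ]


# Print a shell-ready command line; run it manually or via subprocess.run(cmd).
print(shlex.join(fastp_cmd_with_adapter("SRR18278237")))
```

Keeping every other option identical means any change in the FastQC adapter plot can be attributed to the `-a` setting alone.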

I have attached the JSON and HTML reports from fastp for your reference. I would greatly appreciate any insights or suggestions you might have to resolve this issue.

Thank you for your assistance and for developing such a useful tool.

Best regards,
Xiaowen

I have a similar issue, but with Nextera adapters. fastp reports no adapter contamination, while FastQC reports Nextera adapter content rising to 10% toward the read ends. Even when I supply the Nextera FASTA file (the one provided with Trimmomatic), virtually no trimming happens.

Trimmomatic, run with ILLUMINACLIP:"${ADAPTERS}":2:30:10 SLIDINGWINDOW:4:25 MINLEN:45, drops 7.25% of all reads.

This isn't a perfect comparison: I think fastp's default minimum window quality is 20, not 25. Still, something seems off here. I'm using fastp v0.23.2.