Before or after cutadapt?
Closed this issue · 2 comments
Hi!
Thanks a lot for developing this wonderful tool! Before this tool, we trimmed our reads based on experience, by looking at the sequencing report, which is not very "scientific".
But I am confused about two things. First, at which step exactly should I apply this tool? Should I run it before or after the cutadapt step?
If it is after, only the amplicon will be left (without adapters and primers). Should I change any FIGARO settings in that case?
Second, when entering forwardprimerlength and reverseprimerlength, should I add the length of the tag/index used for Illumina sequencing, or just use the original primer length for the 16S rRNA gene?
Best Regards!
Tong
Sorry for the late response, I missed this issue earlier. You generally should not trim reads before using the program, since the current version requires that all reads be the same length. The primer length values should cover the primer itself plus any other "technical" or "artifact" sequence at the start of the reads. In other words, anything from your library prep that sits at the beginning of a read and isn't target sequence should be counted as part of that length (it is simply the number of bases to left-trim).
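As a rough illustration of that arithmetic (a sketch only, not part of FIGARO): the helper name `left_trim_length` and the example lengths below are hypothetical, so substitute the values from your own library prep. Only sequence that actually appears at the start of the reads should be counted.

```python
def left_trim_length(primer_len: int, *extra_technical_lens: int) -> int:
    """Return the number of bases to left-trim from a read.

    primer_len: length of the amplification primer.
    extra_technical_lens: lengths of any other non-target sequence found
        at the start of the read (inline tag, spacer, and so on).
    """
    lengths = (primer_len, *extra_technical_lens)
    if any(n < 0 for n in lengths):
        raise ValueError("lengths must be non-negative")
    return sum(lengths)


# Hypothetical example: a 19 nt forward primer preceded by an 8 nt inline
# tag and a 2 nt spacer, and a 20 nt reverse primer with nothing before it.
forward_primer_length = left_trim_length(19, 8, 2)
reverse_primer_length = left_trim_length(20)
```

The two results are the kind of values you would then supply as the forward and reverse primer lengths.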
If you (like a few others) are using a pipeline that produces reads of slightly varying length because technical sequence is trimmed upstream, I am working on a major update to cover that use case.
As for a trimmer like cutadapt: if you are removing fixed-length technical sequence, the DADA2 pipeline should handle that with its left and right trim settings. For variable-length technical sequence, where you need a dedicated trimmer, the update will take that into account, and you should remove that sequence before running FIGARO. If you have fixed-length primers and other technical sequence and are running cutadapt only to trim for quality, you should not do that at all. FIGARO builds error accumulation models for your reads, and trimming off poor-quality sequence is likely to alter those models in ways that may invalidate them.
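If you are unsure whether an upstream step has already left you with reads of varying length, a quick check along these lines may help. This is a minimal sketch that assumes standard four-line FASTQ records (optionally gzip-compressed); the function names are hypothetical and it is not part of FIGARO.

```python
import gzip
from collections import Counter
from pathlib import Path


def read_length_counts(fastq_path):
    """Count how many reads there are of each length in a FASTQ file.

    Assumes four lines per record, with the sequence on the second line.
    """
    path = Path(fastq_path)
    opener = gzip.open if path.suffix == ".gz" else open
    counts = Counter()
    with opener(path, "rt") as handle:
        for line_number, line in enumerate(handle):
            if line_number % 4 == 1:  # sequence line of each record
                counts[len(line.rstrip("\r\n"))] += 1
    return counts


def is_uniform_length(fastq_path):
    """Return True if the file holds at least one read and all reads share one length."""
    return len(read_length_counts(fastq_path)) == 1
```

Calling `read_length_counts` on each of your forward and reverse files shows the length distribution; more than one distinct length suggests the reads were already trimmed somewhere upstream.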
Thanks a lot for the answer!
Best
Tong