amplab/snap

feature request: customizable quality threshold for -C

eboyden opened this issue · 6 comments

Self-explanatory. Being able to trim low-quality bases from the front and back of reads is nice, but ideally the user would be able to determine what "low-quality" means, especially since the manner in which different sequencing platforms bin quality scores may affect the behavior.

This is 2.0.2.dev.8. Please try it out and see if it does what you wanted.

For the most part it does, thanks. Note that I'm running SNAP on a Linux server, so specifying Phred qualities as ASCII characters was a bit tricky, but escaping them with backslashes so the shell passes them through literally worked (e.g. -cc \#\/). It might be worth mentioning this in the documentation.
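For anyone scripting this, here is a minimal sketch of how the quoting could be generated rather than typed by hand. It uses Python's standard `shlex.quote`; the helper name `cc_argument` is hypothetical, and the only thing taken from this thread is the `-cc` flag and the `#/` value. It assumes a POSIX shell such as bash.

```python
import shlex


def cc_argument(chars: str) -> str:
    """Quote the quality characters so a POSIX shell passes them unchanged.

    Hypothetical helper: it only prepares the argument text, it does not
    know anything about how SNAP parses -cc.
    """
    return shlex.quote(chars)


# '#' would start a comment if left unquoted at the start of a word;
# '/' is harmless, which is why escaping it is optional.
command_fragment = "-cc " + cc_argument("#/")
print(command_fragment)

# Round trip: splitting the fragment the way a shell would gives back
# the literal characters.
print(shlex.split(command_fragment))
```

Single-quoting the whole argument is an alternative to backslashing each character; both should hand the same two characters to the program.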

I did see a few cases where it didn't seem to take, and I'm not sure why. When using -C-+ -cc \#\/, most alignments soft-clipped the trailing / like so:

MN01688:12:000H3MG5N:1:13102:11696:12762 163 chr1 185721 29 119M1S = 186012 410 GAGACAGCGGCGGTTTGAGGAGCCACCTCCCAGCCACCTCGGGGCCAGGGCCAGGGTGTGCAGCACCACTGTACTATGGGGAAACTGGCCCAGAGAGGTGAGGCAGCTTGCCTGGGGTCA FFFAFFFFFFFFFFFFFFFFAFFAAFFFFFAFFAFFFFFAFFFFFFFFFFFFFFFFFFFFFFFF/FFFFFFAFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFAFF/

But occasionally they did not:

MN01688:12:000H3MG5N:1:11106:26456:13326 83 chr1 377521 3 120M = 377400 -241 TTGCTGAATGTTAATTCAGAAATGAAATTAAAATTTTAAATTAACAACAAGCAACTTTACAAGAGGAAAAAAAAAAACCTCATTTCCTCCCCACAAAGCCACCTCATGAGCCTGGGTGGT FFFF/FFF/A//FAAFFFA/FFF/FFAF/FF/FFFAFFFAFFFFA//FFAFFFF6/FFAF//FAFFFFF66F6FF/F/F/A6FFFFFF/FFFFFFA/AFF/FFFFFFAFFFFFFFFAFF/
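One thing that differs between the two records is the FLAG field (163 versus 83). Below is a minimal sketch that decodes those bits using the bit meanings from the SAM specification. The names `FLAG_BITS` and `decode_flag` are hypothetical. Note that SAM stores QUAL reversed for reverse-strand alignments, so the last QUAL character of such a record is the first base off the sequencer; whether that accounts for the unclipped read here is an assumption, not something this sketch can confirm.

```python
# Bit meanings as defined in the SAM specification.
FLAG_BITS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse",
    0x20: "mate_reverse",
    0x40: "first_in_pair",
    0x80: "second_in_pair",
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}


def decode_flag(flag: int) -> list[str]:
    """Return the names of all bits set in a SAM FLAG value."""
    return [name for bit, name in FLAG_BITS.items() if flag & bit]


for flag in (163, 83):
    print(flag, decode_flag(flag))
```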

Of course, silly me, thanks for clarifying. Incidentally, a single quote is Phred score 6; presumably that would need to be backslashed?
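To double-check which character corresponds to which score, a small sketch of the conversion follows. It assumes the Phred+33 encoding, which is consistent with a single quote being score 6; the function names are hypothetical.

```python
def phred_to_char(score: int, offset: int = 33) -> str:
    """Return the ASCII character encoding a Phred score (Phred+33 assumed)."""
    return chr(score + offset)


def char_to_phred(char: str, offset: int = 33) -> int:
    """Return the Phred score encoded by one quality character."""
    return ord(char) - offset


# Characters that appear in this thread.
for char in "#/'6AF":
    print(repr(char), char_to_phred(char))
```

Under this encoding `#` is 2 and `/` is 14, and the single quote (ASCII 39) is indeed 6, so it would need quoting or escaping in a shell.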

In 2.0.2