a-h-b/dadasnake

reverse complement code is wrong in `cutadapt.smk` and `cutadapt.single.smk` (solution provided)


Hello! I found a bug: primer sequences written in lowercase letters are not reverse-complemented properly. This caused cutadapt to trim 10 seemingly random bases from my reads. I only noticed it when I looked at the cutadapt log, and since people presumably rarely write primers in lowercase, most users would not run into it.

The issue is in lines like this one:

FWD_RC=`echo {config[primers][fwd][sequence]} | tr '[ATUGCYRSWKMBDHNatugcyrswkbdhvn]' '[TAACGRYSWMKVHDBNtaacgryswmkvhdbn]' |rev`

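You can see that the tr command's first argument (the characters to replace) and second argument (their replacements) are not the same length. tr maps characters by position rather than by regex, so the lengths have to match. They differ because two letters, V and m, are missing from the first argument: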
ATUGCYRSWKMBDHNatugcyrswkbdhvn
TAACGRYSWMKVHDBNtaacgryswmkvhdbn
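
A quick way to confirm the mismatch is to compare the lengths of the two sets in the shell. This is only a sketch: the variable names `set1` and `set2` are mine, and I left out the square brackets, which as far as I know GNU tr treats as ordinary characters here.

```shell
# The two sets from the current rule, without the surrounding brackets.
# set1 and set2 are illustrative names, not taken from the smk files.
set1='ATUGCYRSWKMBDHNatugcyrswkbdhvn'
set2='TAACGRYSWMKVHDBNtaacgryswmkvhdbn'

# ${#var} expands to the length of the string.
echo "set1: ${#set1} characters"
echo "set2: ${#set2} characters"
```

Because tr pairs characters by position, every character after the first missing letter is mapped to the wrong replacement.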

The fix is to change tr's first argument from
ATUGCYRSWKMBDHNatugcyrswkbdhvn to
ATUGCYRSWKMBDHVNatugcyrswkmbdhvn in every FWD_RC= and RVS_RC= line in both cutadapt smk files.

In summary, this is the modified tr code:
tr '[ATUGCYRSWKMBDHVNatugcyrswkmbdhvn]' '[TAACGRYSWMKVHDBNtaacgryswmkvhdbn]' |rev
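
To check the corrected sets outside of Snakemake, something like the following should work. It is a minimal sketch: `revcomp` is a hypothetical helper name, the primer is only an example input, and I dropped the square brackets because they just map to themselves.

```shell
# Hypothetical helper: reverse-complement an IUPAC nucleotide string
# using the corrected, equal-length tr sets.
revcomp() {
    printf '%s\n' "$1" \
        | tr 'ATUGCYRSWKMBDHVNatugcyrswkmbdhvn' 'TAACGRYSWMKVHDBNtaacgryswmkvhdbn' \
        | rev
}

# Example input only; upper- and lowercase should give matching results.
revcomp 'GTGYCAGCMGCCGCGGTAA'   # prints TTACCGCGGCKGCTGRCAC
revcomp 'gtgycagcmgccgcggtaa'   # prints ttaccgcggckgctgrcac
```

If the lowercase result is not simply the lowercase form of the uppercase one, the two sets are still out of step.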

Bumping this up for attention @a-h-b?

You can just copy the new line from line 105 onwards in the commit I referenced to fix it. My branch has diverged far from yours so I couldn't make a pull request.