mikessh/migec

Documentation of the max-offset setting is unclear

gustavjoh opened this issue · 1 comment

I am trying to understand what the --overlap-max-offset X setting controls. We sequence paired-end 150+150 reads, and our amplicons are roughly 200 bp, so I assume our read-through is high.

What does offset mean in this context? Could the documentation include more examples of which max-offset value to use in different scenarios?

The documentation says:
--overlap-max-offset X controls to which extent overlapping region is searched.
IMPORTANT If the read-through extent is high (reads are embedded) should be set to ~40.

I think it would be great if the documentation were a bit clearer at this point. Thanks for a great tool!

I have the same question. My best guess is that the offset parameter sets a limit on the number of nucleotides by which a given pair of overlapping reads can differ, that is, the part that does not overlap. I have worked with overlapping mimotope peptides, where "offset" is defined in a similar fashion. For example, if two 15-mers overlap by 11 amino acids, the offset is 15 - 11 = 4 (the number of amino acids that do not overlap). Hope this helps.
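
To make the arithmetic concrete, here is a minimal sketch of that guess in Python. It is not taken from the MIGEC code or documentation: the function names are made up for illustration, and it assumes that the two mates start at opposite ends of the amplicon and that "offset" means the non-overlapping part of a read, as in the peptide example. Whether --overlap-max-offset is measured this way is exactly the open question here.

```python
def expected_overlap(read1_len, read2_len, amplicon_len):
    """Bases covered by both mates (hypothetical helper, not a MIGEC function).

    Assumes the mates start at opposite ends of the amplicon.
    """
    overlap = read1_len + read2_len - amplicon_len
    # The overlap cannot be negative, and it cannot be longer than
    # either read or the amplicon itself.
    return max(0, min(overlap, read1_len, read2_len, amplicon_len))


def non_overlapping_part(read_len, overlap):
    """'Offset' in the peptide sense: positions of one read outside the overlap."""
    return read_len - overlap


# Peptide example from this comment: two 15-mers overlapping by 11 residues.
print(non_overlapping_part(15, 11))

# Setup from the original question: 150+150 paired-end reads, ~200 bp amplicon.
overlap = expected_overlap(150, 150, 200)
print(overlap, non_overlapping_part(150, overlap))
```

Under these assumptions, the 150+150 reads on a 200 bp amplicon would overlap by about 100 bp, leaving about 50 bp of each read outside the overlap. That only illustrates the definition guessed above and is not a recommended value for the setting.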