bokulich-lab/RESCRIPt

ENH: extract-seq-segments add param "min_coverage"


The Problem

extract-seq-segments does not expose a min_coverage threshold, so it can extract sequences that pass the %identity threshold but actually have very low coverage. (I do not know off-hand what the current default is, but we should set an explicit default either way.)

Solution

Expose a min_coverage parameter that sets the minimum coverage of the reference required for a hit to pass.

This should be relatively easy to expose: the vsearch-global action in q2-feature-classifier sets a precedent and can be used as a template here. The one difference (I think) is that in q2-feature-classifier the min_coverage parameter refers to the minimum coverage of the query, whereas in extract-seq-segments it should be the minimum coverage of the reference, which in this case is the shorter sequence used to recruit additional sequences.
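To make the query-versus-reference distinction concrete, here is a minimal pure-Python sketch. It is not RESCRIPt or vsearch code: the `Hit` class, the `passes_filter` helper, and the threshold values are all hypothetical and chosen only for illustration. It assumes coverage is measured as the fraction of a sequence's bases that fall inside the alignment.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """Hypothetical summary of one alignment between a query and a reference."""
    query_len: int            # length of the (long) input sequence
    ref_len: int              # length of the (short) reference segment
    aligned_query_bases: int  # query bases inside the alignment
    aligned_ref_bases: int    # reference bases inside the alignment
    perc_identity: float      # identity over the aligned region, 0..1


def query_coverage(hit: Hit) -> float:
    """Fraction of the query that is aligned."""
    return hit.aligned_query_bases / hit.query_len


def reference_coverage(hit: Hit) -> float:
    """Fraction of the reference segment that is aligned."""
    return hit.aligned_ref_bases / hit.ref_len


def passes_filter(hit: Hit, perc_identity: float = 0.7,
                  min_coverage: float = 0.9) -> bool:
    """Keep a hit only if identity AND reference coverage are high enough.

    The default thresholds are illustrative assumptions, not RESCRIPt defaults.
    """
    return (hit.perc_identity >= perc_identity
            and reference_coverage(hit) >= min_coverage)


# A 1500 bp input sequence matching almost all of a 250 bp reference segment:
# query coverage is low, but reference coverage is high, so the hit is kept.
full_hit = Hit(query_len=1500, ref_len=250,
               aligned_query_bases=240, aligned_ref_bases=240,
               perc_identity=0.95)

# The same pair, but only 50 bp of the reference aligns: identity alone would
# accept this hit, while the reference-coverage check rejects it.
partial_hit = Hit(query_len=1500, ref_len=250,
                  aligned_query_bases=50, aligned_ref_bases=50,
                  perc_identity=0.98)

kept = [h for h in (full_hit, partial_hit) if passes_filter(h)]
```

In this sketch a query-coverage threshold would reject both hits, because the input sequence is much longer than the segment being extracted. That is why the threshold needs to apply to the reference here.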

resolved by #199