davised/automlsa2

multiple gene copies

Closed this issue · 2 comments

Hi @davised

First off, great tool!!!!!! This is a very very useful tool for me tbh

I am conducting a phylogenetic analysis using the 16S rRNA gene, which is not a single-copy gene. As a result, some of the strains studied carry 1-6 copies of the gene. So what happens in the automlsa2 pipeline? Does the pipeline generate a consensus sequence of all the copies, or does it select a specific copy?

Thank you for your input.

Cheers,
Pablo

Hi Pablo,

Thanks for sending the message.

For multi-copy genes, only the highest-scoring blastn/tblastn hit is retained for each genome. As a result, which copy gets selected can change depending on the query gene used in the search.

Because of the way automlsa2 gathers the subject sequences, there is no easy way to determine consensus sequences or allow multi-copy hits.
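To illustrate what "keep only the highest-scoring hit" means in practice, here is a minimal sketch. It is not automlsa2's actual code. It assumes BLAST+ tabular output with the default `-outfmt 6` columns, uses bitscore as the score, and assumes a hypothetical `genome|contig` naming scheme for subject IDs; the function name `best_hit_per_genome` and the toy rows are made up for the example.

```python
import csv
import io

# Default column order of BLAST+ tabular output (-outfmt 6).
FIELDS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
          "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def best_hit_per_genome(blast_tsv, genome_of):
    """Keep only the highest-bitscore hit for each (query, genome) pair.

    blast_tsv: file-like object with tab-separated BLAST hits.
    genome_of: function mapping a subject ID to its genome name
               (hypothetical; depends on how subjects are labelled).
    """
    best = {}
    reader = csv.DictReader(blast_tsv, fieldnames=FIELDS, delimiter="\t")
    for row in reader:
        key = (row["qseqid"], genome_of(row["sseqid"]))
        score = float(row["bitscore"])
        # Replace the stored hit only if this one scores strictly higher.
        if key not in best or score > float(best[key]["bitscore"]):
            best[key] = row
    return best


# Toy data: strainA has three 16S copies, strainB has one.
example = "\n".join([
    "16S\tstrainA|contig1\t99.1\t1500\t13\t0\t1\t1500\t100\t1599\t0.0\t2700",
    "16S\tstrainA|contig2\t99.8\t1500\t3\t0\t1\t1500\t5000\t6499\t0.0\t2755",
    "16S\tstrainA|contig5\t98.9\t1500\t16\t0\t1\t1500\t200\t1699\t0.0\t2680",
    "16S\tstrainB|contig1\t97.0\t1500\t45\t0\t1\t1500\t300\t1799\t0.0\t2500",
])

hits = best_hit_per_genome(io.StringIO(example), lambda s: s.split("|")[0])
for (query, genome), row in sorted(hits.items()):
    print(query, genome, row["sseqid"], row["bitscore"])
```

In this toy case the copy on `strainA|contig2` is kept and the other two strainA copies are dropped. With a different query sequence the scores would shift, so a different copy could end up being the one retained.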

Hopefully this makes sense. Feel free to ask additional questions.

Hi @davised

Thanks for your reply! It makes total sense.

Cheers,
Pablo