How is the sequence ID calculated in an efficient manner

Question

Opened this issue 2 years ago · 1 comment

Hello,
In your excellent paper, a key aspect is the sequence identity between the artificial sequences and any known natural sequences.
May I ask how this sequence identity is calculated efficiently? It seems to require screening entire databases for each sequence.
Many thanks in advance

Answer 1 · 2023-02-01T18:32:39.000Z

These values are calculated using the MMseqs2 tool to find the closest matches between the generated sequences and the protein databases. We report the identity to the top database hit for each generated sequence.
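
For readers who want to reproduce something similar, here is a minimal sketch of the post-processing step. It is not the authors' actual pipeline. It assumes the search was run with something like `mmseqs easy-search generated.fasta targetDB hits.m8 tmp` (file and database names are hypothetical) and that the result uses MMseqs2's default tab-separated columns: query, target, fident, alnlen, mismatch, gapopen, qstart, qend, tstart, tend, evalue, bits. It also assumes "top hit" means the hit with the highest bit score; the function name `top_hit_identity` is made up for illustration.

```python
import csv
from typing import Dict, Iterable, Tuple

# Assumed column positions in MMseqs2's default tabular output:
# query, target, fident, alnlen, mismatch, gapopen,
# qstart, qend, tstart, tend, evalue, bits
QUERY, TARGET, FIDENT, BITS = 0, 1, 2, 11


def top_hit_identity(lines: Iterable[str]) -> Dict[str, Tuple[str, float]]:
    """Map each query ID to (target ID, identity) of its best-scoring hit.

    "Best" is taken to be the highest bit score (an assumption, see above).
    """
    best: Dict[str, Tuple[str, float, float]] = {}
    for row in csv.reader(lines, delimiter="\t"):
        if len(row) < 12:
            continue  # skip blank or malformed lines
        query, target = row[QUERY], row[TARGET]
        identity = float(row[FIDENT])
        bits = float(row[BITS])
        # Keep this hit only if it beats the best one seen so far for the query.
        if query not in best or bits > best[query][2]:
            best[query] = (target, identity, bits)
    # Drop the bit score; callers only need the target and its identity.
    return {q: (t, ident) for q, (t, ident, _) in best.items()}
```

To use it, pass an open file handle, for example `with open("hits.m8") as fh: identities = top_hit_identity(fh)`. Generated sequences with no database hit do not appear in the search output, so they will be missing from the returned dictionary and need to be handled separately.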