Reg: Vsearch instead of Usearch

Question

Closed this issue a year ago · 4 comments

Hi!

Would it be possible to support vsearch instead of usearch in the future? It is completely open source and as fast as usearch.

That way, you could process a single database file instead of splitting it into 4.

Answer 1 · 2019-06-11T10:13:13.000Z

Hi! Thank you for your message.

I have considered that, but as I understand it, vsearch works only with nucleotide sequences, and I use UniProtKB (protein sequences) as the source database. Please let me know if you have any other suggestions; I am actually looking for an open-source alternative to usearch.

Answer 2 · 2019-06-11T10:41:04.000Z

Would something like Diamond/Plast work?

https://github.com/bbuchfink/diamond/ or https://team.inria.fr/genscale/high-throughput-sequence-analysis/plast-intensive-sequence-comparison/

I've found that plast and diamond are pretty fast, and both can produce output in a number of NCBI BLAST-compatible formats.

Maybe we can also look into Kaiju's or Paladin's approach?!
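
To make the point about BLAST-compatible output concrete, here is a minimal sketch of reading the standard 12-column tabular format (BLAST "format 6") and keeping the best hit per query. It assumes the aligner is configured to emit the default column order; the function name `best_hits` and the sample rows are made up for illustration and are not part of Hayai-annotation or of any of the tools mentioned.

```python
import csv
import io

# Default column order of BLAST tabular output (format 6).
BLAST6_FIELDS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def best_hits(handle):
    """Return {query_id: (subject_id, bitscore)}, keeping the highest-bitscore hit per query."""
    best = {}
    for row in csv.reader(handle, delimiter="\t"):
        # Skip blank lines and comment lines.
        if not row or row[0].startswith("#"):
            continue
        rec = dict(zip(BLAST6_FIELDS, row))
        score = float(rec["bitscore"])
        query = rec["qseqid"]
        if query not in best or score > best[query][1]:
            best[query] = (rec["sseqid"], score)
    return best


# Made-up rows, for illustration only (not real alignment results).
example = io.StringIO(
    "q1\tsubjA\t98.0\t100\t2\t0\t1\t100\t1\t100\t1e-50\t200.0\n"
    "q1\tsubjB\t80.0\t100\t20\t0\t1\t100\t1\t100\t1e-20\t90.5\n"
    "q2\tsubjC\t75.0\t50\t12\t1\t1\t50\t3\t52\t1e-10\t60.0\n"
)
hits = best_hits(example)
print(hits)
```

Because the parser only depends on the column layout, the same code should work for any aligner that writes this format, which makes swapping the search tool less invasive.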

Answer 3 · 2019-06-12T08:02:49.000Z

Hi Harish,
thank you again for your comments.

I have tried Diamond before, but I was concerned about its accuracy. I have checked the Plast manuscript, and it seems to be both more accurate and faster than Diamond.

I am implementing Plast in Hayai-annotation. I will also make a Docker container using Plast. I understand it is OK to include it in the Docker container, right?

Please, feel free to share more ideas.

Best regards,
Andrea

Answer 4 · 2019-06-18T07:21:28.000Z

@kdri-genomics

Apologies for the delay, I was busy with some personal engagements!

The real question is whether you are getting benchmarks similar to the original usearch-based approach.

You can merge if it's similar or better 👍
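
One simple way to check "similar or better" is to measure how often the two tools agree on the top hit per query. This is only a sketch of one possible metric, not the benchmark used for Hayai-annotation; the function name `top_hit_agreement` and the dictionaries below are hypothetical, made-up data.

```python
def top_hit_agreement(reference, candidate):
    """Fraction of reference queries for which the candidate reports the same top hit."""
    if not reference:
        return 0.0
    same = sum(1 for query, subject in reference.items()
               if candidate.get(query) == subject)
    return same / len(reference)


# Hypothetical top hits per query (illustrative values, not real results).
usearch_top = {"q1": "subjA", "q2": "subjB", "q3": "subjC", "q4": "subjD"}
plast_top = {"q1": "subjA", "q2": "subjB", "q3": "subjX"}

agreement = top_hit_agreement(usearch_top, plast_top)
print(f"top-hit agreement: {agreement:.2f}")
```

Queries missing from the candidate count as disagreements here, so the number also reflects lost sensitivity, not just different hits.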

Docker should be fine! I'm currently working on a plant annotation, so I'll definitely give it a spin :)

Regards,
Harish