qiime2/q2-feature-classifier

DNNs for feature classification

Closed this issue · 6 comments

Improvement Description
Crazy idea.

What if we implemented a deep convolutional neural network for taxonomy assignment?

Comments
A couple of immediate advantages come to mind:

  1. DNNs are likely to be much more accurate than random forests or naive Bayes. There is a lot of literature on this topic, and the approach is becoming more popular for sequence classification (see here for one example).
  2. Using something like TensorFlow or PyTorch would yield immediate speed gains, since both ship with multi-threading and GPU support. I'm writing this as I approach the 2-hour mark running taxonomy assignment on 30k sequences. Other advantages are worth considering too, such as mini-batching, model diagnostics, and so on.
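
To make the mini-batching point concrete, here is a rough sketch of the input side only: one-hot encoding reads and grouping them into fixed-size batches that a convolutional model could consume. This is not code from q2-feature-classifier. The names `ALPHABET`, `one_hot`, and `mini_batches`, the fixed read length, and the choice to encode ambiguous bases as all zeros are all assumptions made for illustration.

```python
from typing import Iterator, List, Sequence

# Hypothetical alphabet; ambiguous bases such as N are encoded as all zeros.
ALPHABET = "ACGT"


def one_hot(seq: str, length: int) -> List[List[float]]:
    """Encode a DNA string as a `length` x 4 matrix, truncating or zero-padding."""
    rows: List[List[float]] = []
    for base in seq[:length].upper():
        row = [0.0] * len(ALPHABET)
        idx = ALPHABET.find(base)
        if idx >= 0:
            row[idx] = 1.0
        rows.append(row)
    # Pad short reads with all-zero rows so every matrix has the same shape.
    rows.extend([0.0] * len(ALPHABET) for _ in range(length - len(rows)))
    return rows


def mini_batches(
    seqs: Sequence[str], batch_size: int, length: int
) -> Iterator[List[List[List[float]]]]:
    """Yield consecutive batches of encoded sequences (the last may be smaller)."""
    for start in range(0, len(seqs), batch_size):
        yield [one_hot(s, length) for s in seqs[start:start + batch_size]]


batches = list(mini_batches(["ACGT", "AC", "GGNTA"], batch_size=2, length=4))
```

In TensorFlow or PyTorch each batch would then be converted to a tensor and moved to the GPU; that part is left out so the sketch has no dependencies.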

References
here

I like it; this has been in the back of my mind for a while. Sounds like you already have a DNN taxonomy classifier prototype... if your experiment works, might I propose we benchmark it in tax-credit and go from there?

@mortonjt, still interested in this? I like the plan - I suspect we just need a developer.

Hi both, we already tried this:

https://www.frontiersin.org/articles/10.3389/fmicb.2021.644487

In summary, we found that NNs and some other approaches did no better than NB, and we even found some reasons why.

You may be able to do better, but I would hope that our work would mean that you're not starting from scratch.

Thanks everyone. I'm thinking I'll close this one out since it sounds like we don't have plans to do this now. We can move any ongoing conversation to the forum. (Feel free to reopen if someone does want to work on moving this forward in this plugin.)