smaegol/PlasFlow

Synthetic plasmid detection

cinaljess opened this issue · 2 comments

Hello PlasFlow,

I notice you train the model on RefSeq references. In my use case I would like to identify and reconstruct plasmids from NGS data, but all of my plasmids are synthetic constructs and probably don't resemble any of the references in RefSeq.
My question is: where should I look to re-train the model on a reference set containing my plasmids, so that I can then identify them among millions of NGS reads? Or will the tool identify synthetic constructs correctly as is?

Thank you,
Alexander

Hi,

It's hard to answer your question. First, I think that PlasFlow may be able to catch some synthetic constructs, but this should be checked, so you should benchmark it on your own data. Second, training on such a set can definitely be done, but PlasFlow doesn't support training yet. If you plan to train a neural network yourself, remember that you need both positive (plasmid) and negative examples, so it's really important to prepare the training set properly.
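
To illustrate the point about positive and negative examples, here is a minimal Python sketch of how a labeled training set could be assembled. It is not PlasFlow code: the helper names (`kmer_frequencies`, `build_training_set`), the choice of k-mer frequencies as features, the use of chromosomal sequences as negatives, and the toy sequences are all assumptions made for the example.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(seq, k=3):
    """Return k-mer frequencies for one sequence in a fixed k-mer order.

    Hypothetical helper: only k-mers made of A, C, G, T are counted.
    """
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[m] for m in kmers)
    # Guard against sequences shorter than k or with no valid k-mers.
    return [counts[m] / total if total else 0.0 for m in kmers]


def build_training_set(plasmid_seqs, chromosome_seqs, k=3):
    """Build feature vectors and labels: 1 = plasmid, 0 = chromosome."""
    features, labels = [], []
    for seq in plasmid_seqs:
        features.append(kmer_frequencies(seq, k))
        labels.append(1)
    for seq in chromosome_seqs:
        features.append(kmer_frequencies(seq, k))
        labels.append(0)
    return features, labels


# Toy sequences for illustration only; real input would be full-length
# synthetic constructs (positives) and host chromosomal contigs (negatives).
plasmids = ["ATGCGTACGTTAGC", "GGCATCGATCGGAT"]
chromosomes = ["TTTTAAAACCCCGGGG", "ACGTACGTACGTACGT"]
X, y = build_training_set(plasmids, chromosomes, k=2)
```

The resulting `X` and `y` could be passed to any classifier. Holding out part of the labeled set would also give the benchmark mentioned above.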

Thank you for the suggestion, Smaegol.