If possible, could the code for evotuning and the weights for eUniRep tuned to TEM-1 beta-lactamase be shared?
ivanjayapurna opened this issue · 4 comments
Some code and/or a concise notebook on how to perform the evotuning would be greatly appreciated. We're interested in retraining UniRep with a particular set of sequences and we already have a pipeline to perform the embeddings and test them on different top models and ML tasks. We are really curious to see whether the evotuned UniRep embeddings can improve our predictions and how it compares to other sequence descriptors such as 1-hot, z-scales, fingerprints, etc.
I suspect evotuning is of higher interest to academia & industry than the current example on how to train a top model using the original UniRep.
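For anyone who wants a baseline to compare against, here is a minimal sketch of the 1-hot descriptor mentioned above. It is not taken from the UniRep repo or any particular library; the function names (`one_hot_encode`, `one_hot_features`) and the amino-acid ordering are assumptions made for illustration, and it assumes all sequences have the same length and contain only the 20 standard residues.

```python
# Hypothetical helper names; plain Python, no external dependencies.
# The residue ordering below is an arbitrary choice for this sketch.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def one_hot_encode(sequence):
    """Return a flat list of length len(sequence) * 20 for one sequence."""
    features = []
    for residue in sequence:
        row = [0.0] * len(AMINO_ACIDS)
        # Raises KeyError for residues outside the 20 standard amino acids.
        row[AA_INDEX[residue]] = 1.0
        features.extend(row)
    return features


def one_hot_features(sequences):
    """Encode equal-length sequences into a list of flat feature vectors."""
    lengths = {len(s) for s in sequences}
    if len(lengths) > 1:
        raise ValueError("one-hot features need sequences of equal length")
    return [one_hot_encode(s) for s in sequences]


if __name__ == "__main__":
    X = one_hot_features(["ACD", "WYA"])
    print(len(X), len(X[0]))  # 2 sequences, 3 positions * 20 residues each
```

The resulting vectors can be fed to the same top models as the UniRep embeddings, so the two descriptors are compared on equal footing.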
Since the devs seem to be fairly inactive, I'd suggest JAX-UniRep, a very simple and fast re-implementation of UniRep in JAX by Novartis scientists. It seems to be working for us at the moment.
It can embed multiple sequences at once and runs at least an order of magnitude faster. They also provide an interface for evotuning, which is exactly what I and many other non-ML specialists needed.
Hope it is useful.
Hey @jmahenriques, I'm currently working with the creators of JAX-UniRep to fix a bug in the evotuning function: ElArkk/jax-unirep#37
Did you run into this issue when using the library?
Hi @ivanjayapurna! Nice, I really like their implementation and enjoyed reading the technical paper. I am definitely planning to retrain the model using JAX-UniRep, but I can't allocate the time and resources until it is clear whether we can use the pre-trained weights for both embedding and evotuning. They seem to be protected against commercial use, and I work at a pharmaceutical company. Worst case, we might need to train the model from scratch, reproducing the original UniRep paper, to get around the license. I will take a look at the issue you mention, and if I run into it I'll make sure to report it. Thanks!