Does UniRep support representation for any length protein sequence?
duolinwang opened this issue · 1 comments
Hi,
I see in your tutorial that "Note that training is currently only publicly supported for amino acid sequences less than 275 amino acids as gradient updates for sequences longer than that start to get unwieldy." So for training, sequences must be shorter than 275 amino acids. But when I feed in a longer sequence and just call the get_rep function to generate the representation, it works. I still want to make sure this is the correct way to generate representations for longer protein sequences. Thank you!
Thanks for your question. It is quite feasible to get representations for very long sequences, just as you described, using the get_rep method. Make sure you pass model_path="1900_weights" when initializing babbler1900, so that you use the trained weights rather than a random initialization.
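For reference, here is a minimal sketch of that workflow. It assumes, based on the UniRep tutorial, that `get_rep` takes a single sequence string and returns three values (averaged hidden state, final hidden state, final cell state). The helper `embed_sequences` and the `FakeBabbler` stand-in are hypothetical names used only so the snippet runs without TensorFlow or the weight files; with the real package you would pass in a `babbler1900` instance built with `model_path="1900_weights"`.

```python
def embed_sequences(model, sequences):
    """Collect the averaged hidden state for each sequence.

    `model` is any object exposing get_rep(seq). The three-value return
    (avg_hidden, final_hidden, final_cell) is an assumption taken from
    the UniRep tutorial, not verified here.
    """
    reps = []
    for seq in sequences:
        # Forward pass only: no gradients are computed, so the 275-residue
        # training limit does not apply.
        avg_hidden, final_hidden, final_cell = model.get_rep(seq)
        reps.append(avg_hidden)
    return reps


class FakeBabbler:
    """Hypothetical stand-in for babbler1900, for illustration only.

    It returns the sequence length as a one-element "representation".
    """

    def get_rep(self, seq):
        rep = [float(len(seq))]
        return rep, rep, rep


if __name__ == "__main__":
    long_seq = "M" * 300  # longer than the 275-residue training limit
    print(embed_sequences(FakeBabbler(), [long_seq]))
```

With the real model, the only change should be replacing `FakeBabbler()` with the initialized `babbler1900` object.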
Memory use only explodes during back-propagation, not the forward pass.
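To illustrate why, here is a toy count of stored values, not a measurement of UniRep itself. It assumes that a forward pass can, in principle, keep only the current hidden state, the cell state, and a running sum for the averaged representation, while back-propagation through time has to keep intermediate activations for every position. The function name and `intermediates_per_step` are hypothetical; 1900 is taken from the model name, and an actual implementation may retain more during the forward pass than this idealized count.

```python
def stored_floats(seq_len, hidden_size=1900, intermediates_per_step=8,
                  training=False):
    """Toy estimate of how many floats must be held for one sequence.

    intermediates_per_step is a made-up illustrative constant, not a
    value taken from the UniRep code.
    """
    if training:
        # Back-propagation needs the activations from every time step.
        return seq_len * hidden_size * intermediates_per_step
    # Idealized inference: hidden state + cell state + running sum.
    return 3 * hidden_size


if __name__ == "__main__":
    for length in (275, 2000):
        print(length,
              stored_floats(length, training=False),
              stored_floats(length, training=True))
```

Under these assumptions the inference count stays constant as the sequence gets longer, while the training count grows linearly with length.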