The sequences of the wildtype for the DMS datasets
lzhangUT commented
Hi,
Really appreciate your work, it is very helpful to what I am working on right now.
I am kind of a newbie to deep learning on protein sequences, so forgive me if I am asking silly questions.
Thanks for providing the DMS data in the supplement of your paper 'Deep generative models of genetic variation capture the effects of mutations'. I wonder if you have the raw wild-type protein sequence for each of the DMS datasets.
- If you already have the sequences, can you point me to where they are?
- If not, did you use the UniProt IDs in Supplemental Table 1 to extract the sequences? If so, can you point me to the code that does this, or did you do it manually on the UniProt website?
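
To make the second option concrete, here is a rough sketch of what I would try on my own. It is only my guess, not your method: it assumes the IDs in the table can be used as UniProt accessions with the public UniProt REST endpoint `https://rest.uniprot.org/uniprotkb/<accession>.fasta`, and the helper names (`fasta_url`, `parse_fasta`, `fetch_sequence`) are my own.

```python
from urllib.request import urlopen

# Assumption: the public UniProt REST endpoint that serves one entry as FASTA.
UNIPROT_FASTA_URL = "https://rest.uniprot.org/uniprotkb/{accession}.fasta"


def fasta_url(accession: str) -> str:
    """Build the FASTA download URL for one UniProt accession."""
    return UNIPROT_FASTA_URL.format(accession=accession.strip())


def parse_fasta(text: str) -> str:
    """Return the sequence of the first record in a FASTA string."""
    lines = text.strip().splitlines()
    if not lines or not lines[0].startswith(">"):
        raise ValueError("input does not look like a FASTA record")
    residues = []
    for line in lines[1:]:
        if line.startswith(">"):
            break  # stop at the next record; only the first one is wanted
        residues.append(line.strip())
    return "".join(residues)


def fetch_sequence(accession: str, timeout: float = 30.0) -> str:
    """Download one entry from UniProt and return its sequence as a string."""
    with urlopen(fasta_url(accession), timeout=timeout) as response:
        return parse_fasta(response.read().decode("utf-8"))
```

I would then call something like `fetch_sequence("P62593")` for each ID in the table (that accession is just an example I picked). What I cannot tell is whether the sequence UniProt returns is the same as the exact wild-type construct or region used in each DMS experiment, which is why I am asking.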
Thanks a lot! Happy holidays