debbiemarkslab/DeepSequence

Wild-type sequences for the DMS datasets


Hi,
I really appreciate your work; it is very helpful for what I am working on right now.
I am fairly new to deep learning on protein sequences, so forgive me if I am asking silly questions.
Thanks for providing the DMS data in the supplement to your paper 'Deep generative models of genetic variation capture the effects of mutations'. I wonder whether you have the raw wild-type protein sequence for each of the DMS datasets.

  1. If you already have the sequences, could you point me to where they are?
  2. If not, did you use the UniProt IDs in supplemental table 1 to extract the sequences? If so, could you point me to the code that extracts them, or did you do it manually through the UniProt website?
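
To make option 2 concrete, here is a rough sketch of what I would try on my own. It is only my guess, not your pipeline: it assumes the IDs in supplemental table 1 are UniProt accessions that the public UniProt REST endpoint (`https://rest.uniprot.org/uniprotkb/<accession>.fasta`) accepts, and the helper names `fasta_url`, `parse_fasta`, and `fetch_wildtype` are made up for illustration.

```python
from urllib.request import urlopen

# Assumption: the public UniProt REST endpoint that returns one entry as FASTA.
UNIPROT_FASTA_URL = "https://rest.uniprot.org/uniprotkb/{accession}.fasta"


def fasta_url(accession):
    """Build the FASTA download URL for a UniProt accession (hypothetical helper)."""
    return UNIPROT_FASTA_URL.format(accession=accession.strip())


def parse_fasta(text):
    """Return (header, sequence) for the first record in a FASTA string."""
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                break  # stop at the second record; only the first is needed
            header = line[1:]
        elif header is not None:
            chunks.append(line)
    if header is None:
        raise ValueError("no FASTA header found")
    return header, "".join(chunks)


def fetch_wildtype(accession, timeout=30):
    """Download and parse the sequence for one accession (needs network access)."""
    with urlopen(fasta_url(accession), timeout=timeout) as response:
        return parse_fasta(response.read().decode("utf-8"))
```

I would then call `fetch_wildtype(...)` once per accession in the table. My worry is that the sequence UniProt returns may not match the exact construct or residue numbering used in each DMS experiment, which is why I am asking how you obtained them.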

Thanks a lot! Happy holidays