How to mask out the amino acid in that mutation position?

Question

fuxuliu opened this issue 2 years ago · 3 comments

Hello, I recently read your latest paper, "Accurate Mutation Effect Prediction using RoseTTAFold". It is great work and I would like to try it, but I do not know how to mask out the amino acid at the mutation position in the MSA, or what to replace the mutant amino acid with. Could you please provide a complete use case and example data? Thank you.

Answer 1 · 2022-11-07T19:00:44.000Z

As currently set up, mutation effect prediction masks every position in the protein one at a time, and the output numpy array contains the predicted probabilities of every amino acid at every position (shape L x 21). I have added an example of how to read the output file to the README (under 'RF Joint for mutation effect prediction'). Let me know if you have more questions, and whether an argument that takes a specific 'mutation_position' to evaluate would be helpful; I can update the implementation!
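For readers who want to turn the L x 21 probability array into a score for a single mutation, here is a minimal sketch. It is not taken from the repository: the column order in `ALPHABET`, the helper name `mutation_log_odds`, and the choice of a log-odds score (log probability of the mutant residue minus that of the wildtype residue at the masked position) are all assumptions made for illustration. Check the README for the actual column order before using it. The array here is synthetic so the snippet runs on its own; in practice it would be loaded from the output file, for example with `np.load`.

```python
import numpy as np

# ASSUMPTION: order of the 21 output columns (20 amino acids plus one
# gap/unknown class). The real order must be taken from the repository README.
ALPHABET = "ARNDCQEGHILKMFPSTWYV-"


def mutation_log_odds(probs, wildtype, mutation, eps=1e-9):
    """Hypothetical helper: log p(mutant) - log p(wildtype) at one position.

    probs    : array of shape (L, 21), one row of probabilities per position
    wildtype : wildtype sequence of length L
    mutation : string such as 'E4H' (wildtype residue, 1-based position, mutant)
    """
    wt_aa, pos, mut_aa = mutation[0], int(mutation[1:-1]), mutation[-1]
    if probs.shape != (len(wildtype), len(ALPHABET)):
        raise ValueError("probs must have shape (len(wildtype), 21)")
    if wildtype[pos - 1] != wt_aa:
        raise ValueError(f"wildtype has {wildtype[pos - 1]} at {pos}, not {wt_aa}")
    row = probs[pos - 1]  # convert the 1-based position to a 0-based row index
    p_mut = row[ALPHABET.index(mut_aa)]
    p_wt = row[ALPHABET.index(wt_aa)]
    # eps guards against log(0) when a probability is exactly zero
    return float(np.log(p_mut + eps) - np.log(p_wt + eps))


# Synthetic stand-in for the real output array (each row sums to 1).
wildtype = "ACDEFGH"
rng = np.random.default_rng(0)
probs = rng.dirichlet(np.ones(len(ALPHABET)), size=len(wildtype))

score = mutation_log_odds(probs, wildtype, "E4H")
print(f"E4H log-odds: {score:.3f}")
```

Under this scoring convention, a negative value means the model assigns the mutant residue a lower probability than the wildtype residue at that position. Because every position is already masked one at a time, no per-mutation input is needed; any single substitution can be looked up in the same array.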

Answer 2 · 2022-11-11T02:21:45.000Z

Thanks for your reply, but I still don't understand how to set this up.
Could you provide a complete use case? For example, with wildtype = 'ABCDEFG' and a mutation at position 4 (counting from 1) from D to H, the mutant is 'ABCHEFG'. What should the first sequence in input_msa.a3m be, and how do I get the final prediction?
If you could provide a use case like this, it would be very helpful. Thank you.

Answer 3 · 2022-11-11T09:11:46.000Z

I think I understand what you mean. Thanks for your reply.