Amino acid sequence has too many "K/E"
kjogr12 opened this issue · 2 comments
Dear ProteinMPNN team,
Thank you very much for your great efforts. I would like to know why the designed amino acid sequence contains such a large proportion of E/K (e.g. EEKEKELKKYAEKLKKEVKDIESIDVKDGEITVKAKKLTEKTKKAI...). It looks unusual. The input PDB file uses a backbone constructed with RFdiffusion. Is there any solution available?
Thank you in advance for your help.
It's hard to say. You could try adding a negative bias to the K and E amino acids when designing. The model is known to have a bias towards polar residues like K and E at surface positions when used with low sampling temperatures (0.1). You could also try increasing the sampling temperature.
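To illustrate why both suggestions reduce K/E, here is a minimal sketch of temperature-scaled sampling with a per-amino-acid bias. It is a toy model, not ProteinMPNN's actual code: the logits and the bias values of -0.5 are made up, and it assumes the bias is added to the logits before dividing by the temperature. It also assumes the bias is supplied as a JSON dictionary of amino acid letters to values (for example through the `--bias_AA_jsonl` option); check the repository's examples for the exact file format.

```python
import json
import math

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"

def biased_probs(logits, bias, temperature):
    """Softmax over (logit + bias) / temperature for one sequence position."""
    scaled = [(logits[aa] + bias.get(aa, 0.0)) / temperature for aa in ALPHABET]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return {aa: e / total for aa, e in zip(ALPHABET, exps)}

# Toy logits for a single surface position: K and E are slightly favoured.
logits = {aa: 0.0 for aa in ALPHABET}
logits["K"] = 0.5
logits["E"] = 0.5

# Hypothetical bias values; negative numbers make K and E less likely.
bias = {"K": -0.5, "E": -0.5}

p_cold = biased_probs(logits, {}, temperature=0.1)      # low temperature, no bias
p_warm = biased_probs(logits, {}, temperature=0.3)      # higher temperature
p_biased = biased_probs(logits, bias, temperature=0.1)  # low temperature + bias

for name, p in [("T=0.1", p_cold), ("T=0.3", p_warm), ("T=0.1 + bias", p_biased)]:
    print(f"{name:>12}: P(K)={p['K']:.3f}  P(E)={p['E']:.3f}")

# One JSON line holding the bias dictionary (file format is an assumption).
bias_jsonl_line = json.dumps(bias)
```

In this toy case a small logit advantage is amplified strongly at temperature 0.1, which is why raising the temperature or subtracting a modest bias both flatten the K/E preference.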
Thank you for your insightful response. I appreciate your suggestion to add a negative bias to K and E amino acids and to consider adjusting the sampling temperature.
When employing a negative bias, do you have any recommended settings for each amino acid? I am considering configuring the settings based on the amino acid distribution of PDBbench shown in SPDesign (https://www.biorxiv.org/content/10.1101/2023.12.14.571651v1).
I would be grateful if you could share any recommendations you might have.
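For concreteness, the idea of deriving biases from a reference composition could be sketched as follows. This is an unvalidated heuristic, not a recommended setting: it takes the log ratio of a target frequency to the frequency observed in the designed sequence. The target here is a uniform placeholder, not the PDBbench distribution, and the `scale` parameter and pseudocount are hypothetical knobs.

```python
import math
from collections import Counter

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"

def composition(seq, pseudocount=1.0):
    """Amino acid frequencies with a pseudocount so unseen residues are nonzero."""
    counts = Counter(seq)
    total = len(seq) + pseudocount * len(ALPHABET)
    return {aa: (counts.get(aa, 0) + pseudocount) / total for aa in ALPHABET}

def log_ratio_bias(observed, target, scale=1.0):
    """Negative bias for over-represented residues, positive for under-represented."""
    return {aa: scale * math.log(target[aa] / observed[aa]) for aa in ALPHABET}

# Sequence fragment quoted earlier in this issue (the original is truncated).
designed = "EEKEKELKKYAEKLKKEVKDIESIDVKDGEITVKAKKLTEKTKKAI"

# Placeholder target: uniform over 20 residues. Replace with a real reference
# distribution (e.g. one taken from a benchmark set) before using the values.
target = {aa: 1.0 / len(ALPHABET) for aa in ALPHABET}

observed = composition(designed)
bias = log_ratio_bias(observed, target)

for aa in sorted(bias, key=bias.get):
    print(f"{aa}: observed={observed[aa]:.3f}  bias={bias[aa]:+.2f}")
```

Biases computed this way would probably need to be scaled down and checked against the resulting designs, since the model's response to a given bias also depends on the sampling temperature.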