Target pos
LiviaPham opened this issue · 8 comments
Hi,
I'm trying to run Umol but currently I'm stuck at the "Predict" step and don't know how to get the "target_pos $POCKET_INDICES" data. Can you help me?
Thanks for your program and I hope to receive your response.
Best wishes,
Livia.
Hi,
The target positions are defined as all CBs within 10Å from any ligand atom in your binding site. You therefore need to know your binding site.
If you look at the example here: https://colab.research.google.com/github/patrickbryant1/Umol/blob/master/Umol.ipynb
you can see the target positions in stick format.
Hope this helps!
Well,
I'm extremely grateful for your instructions. Maybe my description was not clear. What I'm really stuck on is how to determine the "target_pos $POCKET_INDICES", i.e. the binding site for a protein-ligand pair that has not been studied before, as in your example:
TARGET_POSITIONS:
51,52,54,55,56,57,58,59,60,61,62,63,65,66,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,93,94,95,96,97,98,99,100,101,102,104,105,125,128,129
Do you use a database or some other tool to find this information that you can share with me?
Have a nice day,
Livia.
Hi,
This information has to be provided by you. Maybe look in the PDB for similar proteins with known ligands and take the site from there.
I realise that this field may be new to you (?). In general, a target site is predetermined for drug development (how else do you know you want to drug that site?). If you have the inverse problem, i.e. a drug you know binds to something but not how, I recommend getting a crystal structure.
Hope this helps!
Oh I understand.
I am really a newbie in this field. I have recently joined a course and I am trying to read publications to understand it. If you have a step-by-step guide or instructions for figuring out the target positions in this example, please help me. I am now trying to apply your approach to a new protein-ligand pair.
Many thanks to you.
Livia.
What is your protein?
Hi,
My ligand SMILES: CC1=C(Cl)C=C(NC(=O)NCC2=CC=C3C(=O)N(CC3=C2)C2CCC(=O)NC2=O)C=C1
My protein in Uniprot: Q96SW2
Thank you very much.
Livia.
Hi,
Since you only have one ligand, I am not sure it is meaningful to use Umol. It is better to go to the lab. If you can't/don't want to do that I suggest perhaps focusing on another research topic as predictions will only get you so far.
Still, I provide a guide here:
If you search your Uniprot ID and look at available structures you can see that this is available with a bound peptide: https://www.rcsb.org/3d-view/4M91/1
You can now download this structure and extract all CBs in the protein (chain A) that are within 10Å from the peptide (chain B). These are your target residues:
47,48,49,50,51,53,54,55,56,57,58,59,60,61,89,90,91,94,97,98,102,103,104,105,106,107
sequence: SKKENLLAEKVEQLMEWSSRRSIFRMNGDKFRKFIKAPPRNYSMIVMFTALQPQRQCSVSRQANEEYQILANSWRYSSAFSNKLFFSMVDYDEGTDVFQQLNMNSAPTFMHFPPKGRPKRADTFDLQRIGFAAEQLAKWIADRTDVHIRVFRL
Plug this into Umol and you will obtain a prediction. Note that doing research with AI-tools without really knowing what you are looking for or why is not recommended.
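The extraction step described above (all CBs in chain A within 10Å from chain B) could be sketched roughly as follows. This is only an illustration, not code from Umol: the function names parse_atoms and pocket_indices are made up here, the use of CA for glycine (which has no CB) is an assumption, and returning zero-based positions in order of appearance in the chain is also an assumption that should be checked against what Umol expects.

```python
import math


def parse_atoms(pdb_text):
    """Parse ATOM/HETATM records of the first model from PDB-format text."""
    atoms = []
    for line in pdb_text.splitlines():
        if line.startswith("ENDMDL"):
            break  # only use the first model
        if not line.startswith(("ATOM", "HETATM")):
            continue
        atoms.append({
            "record": line[:6].strip(),
            "name": line[12:16].strip(),
            "resname": line[17:20].strip(),
            "chain": line[21],
            "resid": line[22:27].strip(),  # residue number + insertion code
            "xyz": (float(line[30:38]), float(line[38:46]), float(line[46:54])),
        })
    return atoms


def pocket_indices(pdb_text, protein_chain="A", ligand_chain="B", cutoff=10.0):
    """Return zero-based positions of protein residues whose CB (CA for GLY,
    an assumption) lies within `cutoff` angstroms of any ligand-chain atom."""
    atoms = parse_atoms(pdb_text)
    ligand_xyz = [a["xyz"] for a in atoms if a["chain"] == ligand_chain]

    # One representative atom per protein residue, in order of appearance.
    representative = {}
    for a in atoms:
        if a["chain"] != protein_chain or a["record"] != "ATOM":
            continue
        representative.setdefault(a["resid"], None)
        wanted = "CA" if a["resname"] == "GLY" else "CB"
        if a["name"] == wanted and representative[a["resid"]] is None:
            representative[a["resid"]] = a["xyz"]  # first alternate location wins

    hits = []
    for index, xyz in enumerate(representative.values()):
        if xyz is None:
            continue  # residue without the representative atom resolved
        if any(math.dist(xyz, lig) <= cutoff for lig in ligand_xyz):
            hits.append(index)
    return hits
```

For a downloaded 4M91 file, one would read the file contents into a string and call pocket_indices(text, "A", "B"). Note that the returned positions count only residues present in the structure, so if residues are missing or the sequence given to Umol differs from the crystallised construct, the positions have to be remapped onto that sequence.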
If you do this you get the following:
The average ligand plDDT is: 54.2
This is quite low and the complex is probably inaccurate.
Running single predictions is not the intended use for Umol. I recommend doing a large-scale screen towards your binding site and then verifying whatever you find in the lab. The inverse process you are currently applying is not very logical since you will have to go to the lab regardless.