NaN output for energy/RMSD using Wictor
Closed this issue · 4 comments
Versions:
Python 3.9.17 (main, Jul 5 2023, 20:41:20)
[GCC 11.2.0]
on Linux x86_64
with Fragmenstein 0.14.5 and PyRosetta 2023.40+release.96fa3c5
When performing placements with Wictor:
def place_products(self) -> pd.DataFrame:
"""
This function places products using Fragmenstein.
"""
# set up Wictor
self.setup_Fragmenstein()
# Get pdbblock from template_path
with open(self.template_path) as fh:
pdbblock: str = fh.read()
lab = Laboratory(pdbblock=pdbblock, covalent_resi=None, run_plip=False)
placements: pd.DataFrame = lab.place(place_input_validator(self.input_df), n_cores=self.n_cores,
timeout=self.timeout)
return placements
def setup_Fragmenstein(self):
"""
This function sets up Fragmenstein to run.
"""
# Using Wictor to place just by RDKit minimisation
# make output directory
self.output_path: str = os.path.join(self.output_dir, 'output')
os.makedirs(self.output_path, exist_ok=True)
Wictor.work_path = self.output_path
os.chdir(self.output_dir) # this does the trick
Wictor.monster_throw_on_discard = True # stop this merger if a fragment cannot be used.
Wictor.monster_joining_cutoff = 5 # Å
Wictor.quick_reanimation = False # for the impatient
Wictor.error_to_catch = Exception # stop the whole laboratory otherwise
Wictor.enable_stdout(logging.CRITICAL)
#Wictor.enable_logfile(os.path.join(self.output_path, f'fragmenstein.log'), logging.ERROR)
Laboratory.Victor = Wictor
I get this minimised.json
output for all placements:
{"Energy": {"bound": {"total_score": NaN, "unit": "kcal/mol"}, "unbound": {"total_score": NaN, "unit": "kcal/mol"}, "ideal": {"total_score": NaN, "unit": "kcal/mol"}, "insitu": {"total_score": NaN, "unit": "kcal/mol"}, "apo": {"total_score": NaN, "unit": "kcal/mol"}}, "mRMSD": 0.0, "RMSDs": [0.0]}
Thanks for looking into it!
The Fragmenstein version is a hot-fix version (H issue) for Wictor... so I was worried there would be an issue!
I can replicate the error and it does happen only in Wictor.
import pkg_resources, sys, platform
get_version = lambda name: pkg_resources.get_distribution(name).version
print(f'Python {sys.version}\n'+
f'on {platform.system()} {platform.machine()}\n'+
f'with Fragmenstein {get_version("fragmenstein")} and PyRosetta {get_version("pyrosetta")}'
)
from typing import List
from rdkit import Chem
from fragmenstein import Wictor, Monster, Victor, Igor, demo
from IPython.display import display
from ipywidgets import Output
Victor.enable_stdout(40) # ERROR
Victor.capture_logs()
hit: Chem.Mol = demo.Mac1.get_mol('diamond-x0662_A')
#hits: List[Chem.Mol] = demo.Mac1.get_n_filtered_mols(5)
pdbblock: str = demo.Mac1.get_template()
display(hit)
# Monster
monstah = Monster([hit])
monstah.place_smiles(Chem.MolToSmiles(hit))
print('Monster:')
display(monstah.positioned_mol)
# Victor
with Output() as output:
# There is no way to mute the licence stdout
Igor.init_pyrosetta()
vicky = Victor([hit], pdb_block=demo.Mac1.get_template())
vicky.place(Chem.MolToSmiles(hit))
print('Victor:', vicky.summarize())
display(vicky.minimized_mol)
# Victor w/o PyRosetta
wicky = Wictor([hit], pdb_block=demo.Mac1.get_template())
wicky.place(Chem.MolToSmiles(hit))
print('Wictor:', wicky.summarize())
display(wicky.minimized_mol)
Gives me this (please ignore my usage of the Neon Jupyter theme):
The error [2024-02-07 09:01:11,772] ERROR - MMFF cannot work on a molecule that has errors!
oddly happens with Wictor not Victor. I'll look into it now.
Fixed in 0.14.6
. I will not upload it to pypi because I haven't got round to completing the migration from nglview to py3dmol.
The issues was to do with the neighbourhood not being sanitised before the hydrogens were added (in Monster.get_neighborhood
), which I had changed to address in the hotfix, which was raising some side errors.
The values are not nan and are in the right order of magnitude, which is okay for Wictor, which is a crude hack.
Wictor: {'name': 'ligand', 'smiles': 'Cn1ccc(C(=O)Nc2cccc(C(=O)[O-])c2)n1', 'error': '', 'mode': 'expansion', '∆∆G': -12.346298146748609, '∆G_bound': 207.17850845226974, '∆G_unbound': 219.52480659901835, 'comRMSD': 0.4184665717167791, 'N_constrained_atoms': 18, 'N_unconstrained_atoms': 0, 'runtime': 0.3411738872528076, 'regarded': ['diamond-x0662_A'], 'disregarded': []}
Victor: {'name': 'ligand', 'smiles': 'Cn1ccc(C(=O)Nc2cccc(C(=O)[O-])c2)n1', 'error': '', 'mode': 'expansion', '∆∆G': -3.5929845730282706, '∆G_bound': -3.08374601698008, '∆G_unbound': 0.5092385560481905, 'comRMSD': 0.21786527437335607, 'N_constrained_atoms': 18, 'N_unconstrained_atoms': 0, 'runtime': 6.5042548179626465, 'regarded': ['diamond-x0662_A'], 'disregarded': []}
In terms of outcome, it's nearly identical:
import py3Dmol
viewer = py3Dmol.view(width=1600, height=350)
neighborhood = vicky.monster.get_neighborhood(vicky.apo_pdbblock, cutoff=vicky.settings['ff_neighborhood'], addHs=True)
viewer.addModel(Chem.MolToMolBlock( neighborhood ), 'sdf')
viewer.setStyle({'model':-1}, {'stick':{'colorscheme':'gainsboroCarbon', 'opacity': 1}})
viewer.addModel(Chem.MolToMolBlock( monstah.positioned_mol ), 'sdf')
viewer.setStyle({'model':-1}, {'stick':{'colorscheme':'goldCarbon', 'opacity': 0.5}})
viewer.addModel(Chem.MolToMolBlock( vicky.minimized_mol ), 'sdf')
viewer.setStyle({'model':-1}, {'stick':{'colorscheme':'goldCarbon', 'opacity': 0.5}})
viewer.addModel(Chem.MolToMolBlock( wicky.minimized_mol ), 'sdf')
viewer.setStyle({'model':-1}, {'stick':{'colorscheme':'turquoiseCarbon', 'opacity': 0.5}})
vicky.monster.positioned_mol
viewer.zoomTo()
viewer.show()
Notes for future Teo
Monster.mmff_minimize
is called by Wickor._calculate_thermo_common
and Victor.preminimized_undummied_mol
The reason why the former does not simply call the latter is because the neighbourhood is also scored.
The error ERROR - MMFF cannot work on a molecule that has errors!
comes from a None returned value for AllChem.MMFFGetMoleculeProperties(combo, 'MMFF94')
.
The verbosity can be changed via AllChem.MMFFGetMoleculeProperties(combo, 'MMFF94', mmffVerbosity=2)
.
This spits out a table, which is cool. AtomType 0 is an error. The atom indices are fortran-style.
bad_boi = combo.GetAtomWithIdx( 29 -1 )
bad_boi.SetAtomicNum(34)
frags = AllChem.GetMolFrags(neighborhood, asMols=True)
display(Draw.MolsToGridImage(frags))
Unfortunately, the returned MMFFMolProperties object cannot be made in any other way for debug.
from rdkit import ForceField as FF
FF.MMFFMolProperties() # RuntimeError: This class cannot be instantiated from Python
So p: FF.MMFFMolProperties = AllChem.MMFFGetMoleculeProperties(...)
is the only way. However, this object is cool as it can be queried p.GetMMFFAtomType(1)
, so might be useful elsewhere!
This glitch does not affect https://www.blopig.com/blog/2023/11/the-workings-of-fragmensteins-rdkit-neighbour-aware-minimisation/
Thanks! Will pull 0.14.6
and go from there.