matteoferla/Fragmenstein

NaN output for energy/RMSD using Wictor

Closed this issue · 4 comments

Versions:

Python 3.9.17 (main, Jul  5 2023, 20:41:20) 
[GCC 11.2.0]
on Linux x86_64
with Fragmenstein 0.14.5 and PyRosetta  2023.40+release.96fa3c5

When performing placements with Wictor:

def place_products(self) -> pd.DataFrame:
    """
    This function places products using Fragmenstein.
    """
    # set up Wictor
    self.setup_Fragmenstein()
    # Get pdbblock from template_path
    with open(self.template_path) as fh:
        pdbblock: str = fh.read()
    lab = Laboratory(pdbblock=pdbblock, covalent_resi=None, run_plip=False)
    placements: pd.DataFrame = lab.place(place_input_validator(self.input_df), n_cores=self.n_cores,
                                         timeout=self.timeout)
    return placements

def setup_Fragmenstein(self):
    """
    This function sets up Fragmenstein to run.
    """
    # Using Wictor to place just by RDKit minimisation
    # make output directory
    self.output_path: str = os.path.join(self.output_dir, 'output')
    os.makedirs(self.output_path, exist_ok=True)
    Wictor.work_path = self.output_path
    os.chdir(self.output_dir) # this does the trick
    Wictor.monster_throw_on_discard = True  # stop this merger if a fragment cannot be used.
    Wictor.monster_joining_cutoff = 5  # Å
    Wictor.quick_reanimation = False  # for the impatient
    Wictor.error_to_catch = Exception  # stop the whole laboratory otherwise
    Wictor.enable_stdout(logging.CRITICAL)
    #Wictor.enable_logfile(os.path.join(self.output_path, f'fragmenstein.log'), logging.ERROR)
    Laboratory.Victor = Wictor

I get this minimised.json output for all placements:
{"Energy": {"bound": {"total_score": NaN, "unit": "kcal/mol"}, "unbound": {"total_score": NaN, "unit": "kcal/mol"}, "ideal": {"total_score": NaN, "unit": "kcal/mol"}, "insitu": {"total_score": NaN, "unit": "kcal/mol"}, "apo": {"total_score": NaN, "unit": "kcal/mol"}}, "mRMSD": 0.0, "RMSDs": [0.0]}

Thanks for looking into it!

The Fragmenstein version is a hot-fix version (H issue) for Wictor... so I was worried there would be an issue!
I can replicate the error and it does happen only in Wictor.

import pkg_resources, sys, platform

get_version = lambda name: pkg_resources.get_distribution(name).version

print(f'Python {sys.version}\n'+
      f'on {platform.system()} {platform.machine()}\n'+
      f'with Fragmenstein {get_version("fragmenstein")} and PyRosetta  {get_version("pyrosetta")}'
     )

from typing import List
from rdkit import Chem
from fragmenstein import Wictor, Monster, Victor, Igor, demo
from IPython.display import display
from ipywidgets import Output


Victor.enable_stdout(40) # ERROR
Victor.capture_logs()

hit: Chem.Mol = demo.Mac1.get_mol('diamond-x0662_A')
#hits: List[Chem.Mol] = demo.Mac1.get_n_filtered_mols(5)
pdbblock: str = demo.Mac1.get_template()
display(hit)

# Monster
monstah = Monster([hit])
monstah.place_smiles(Chem.MolToSmiles(hit))
print('Monster:')
display(monstah.positioned_mol)

# Victor
with Output() as output:
    # There is no way to mute the licence stdout
    Igor.init_pyrosetta()
vicky = Victor([hit], pdb_block=demo.Mac1.get_template())
vicky.place(Chem.MolToSmiles(hit))
print('Victor:', vicky.summarize())
display(vicky.minimized_mol)

# Victor w/o PyRosetta
wicky = Wictor([hit], pdb_block=demo.Mac1.get_template())
wicky.place(Chem.MolToSmiles(hit))
print('Wictor:', wicky.summarize())
display(wicky.minimized_mol)

Gives me this (please ignore my usage of the Neon Jupyter theme):
image

The error [2024-02-07 09:01:11,772] ERROR - MMFF cannot work on a molecule that has errors! oddly happens with Wictor not Victor. I'll look into it now.

Fixed in 0.14.6. I will not upload it to pypi because I haven't got round to completing the migration from nglview to py3dmol.

The issues was to do with the neighbourhood not being sanitised before the hydrogens were added (in Monster.get_neighborhood), which I had changed to address in the hotfix, which was raising some side errors.
The values are not nan and are in the right order of magnitude, which is okay for Wictor, which is a crude hack.

Wictor: {'name': 'ligand', 'smiles': 'Cn1ccc(C(=O)Nc2cccc(C(=O)[O-])c2)n1', 'error': '', 'mode': 'expansion', '∆∆G': -12.346298146748609, '∆G_bound': 207.17850845226974, '∆G_unbound': 219.52480659901835, 'comRMSD': 0.4184665717167791, 'N_constrained_atoms': 18, 'N_unconstrained_atoms': 0, 'runtime': 0.3411738872528076, 'regarded': ['diamond-x0662_A'], 'disregarded': []}
Victor: {'name': 'ligand', 'smiles': 'Cn1ccc(C(=O)Nc2cccc(C(=O)[O-])c2)n1', 'error': '', 'mode': 'expansion', '∆∆G': -3.5929845730282706, '∆G_bound': -3.08374601698008, '∆G_unbound': 0.5092385560481905, 'comRMSD': 0.21786527437335607, 'N_constrained_atoms': 18, 'N_unconstrained_atoms': 0, 'runtime': 6.5042548179626465, 'regarded': ['diamond-x0662_A'], 'disregarded': []}

In terms of outcome, it's nearly identical:

import py3Dmol

viewer = py3Dmol.view(width=1600, height=350)

neighborhood = vicky.monster.get_neighborhood(vicky.apo_pdbblock, cutoff=vicky.settings['ff_neighborhood'],  addHs=True)
viewer.addModel(Chem.MolToMolBlock( neighborhood ), 'sdf')
viewer.setStyle({'model':-1}, {'stick':{'colorscheme':'gainsboroCarbon', 'opacity': 1}})

viewer.addModel(Chem.MolToMolBlock( monstah.positioned_mol ), 'sdf')
viewer.setStyle({'model':-1}, {'stick':{'colorscheme':'goldCarbon', 'opacity': 0.5}})

viewer.addModel(Chem.MolToMolBlock( vicky.minimized_mol ), 'sdf')
viewer.setStyle({'model':-1}, {'stick':{'colorscheme':'goldCarbon', 'opacity': 0.5}})

viewer.addModel(Chem.MolToMolBlock( wicky.minimized_mol ), 'sdf')
viewer.setStyle({'model':-1}, {'stick':{'colorscheme':'turquoiseCarbon', 'opacity': 0.5}})

vicky.monster.positioned_mol
viewer.zoomTo()
viewer.show()

image

Notes for future Teo

Monster.mmff_minimize is called by Wickor._calculate_thermo_common and Victor.preminimized_undummied_mol
The reason why the former does not simply call the latter is because the neighbourhood is also scored.

The error ERROR - MMFF cannot work on a molecule that has errors! comes from a None returned value for AllChem.MMFFGetMoleculeProperties(combo, 'MMFF94').

The verbosity can be changed via AllChem.MMFFGetMoleculeProperties(combo, 'MMFF94', mmffVerbosity=2).
This spits out a table, which is cool. AtomType 0 is an error. The atom indices are fortran-style.

bad_boi = combo.GetAtomWithIdx( 29 -1 )
bad_boi.SetAtomicNum(34)
frags = AllChem.GetMolFrags(neighborhood, asMols=True)
display(Draw.MolsToGridImage(frags))

Unfortunately, the returned MMFFMolProperties object cannot be made in any other way for debug.

from rdkit import ForceField as FF

FF.MMFFMolProperties() # RuntimeError: This class cannot be instantiated from Python

So p: FF.MMFFMolProperties = AllChem.MMFFGetMoleculeProperties(...) is the only way. However, this object is cool as it can be queried p.GetMMFFAtomType(1), so might be useful elsewhere!

Thanks! Will pull 0.14.6 and go from there.