openforcefield/openff-toolkit

Partial Charges for Rare Atoms

maciejwisniewski-drugdiscovery opened this issue · 1 comments

Hi,

I'm working with a dataset containing SMILES strings of various ligands, and I want to calculate partial charges for each of them using either the Gasteiger or AM1BCC method.

Here is the Python code I'm using:

from rdkit import Chem
from rdkit.Chem import rdDistGeom
from openff.toolkit import Molecule

smiles = 'C[AsH]C'  # example SMILES

charge_method = 'gasteiger'  # or 'am1bcc'

try:
    rdkit_mol = Chem.MolFromSmiles(smiles)
    # Generate a conformer
    rdkit_mol.RemoveAllConformers()
    rdkit_mol = Chem.AddHs(rdkit_mol)
    params = rdDistGeom.ETKDGv3()
    params.randomSeed = 0xd06f00d
    params.numThreads = 4
    params.maxAttempts = 100
    rdDistGeom.EmbedMultipleConfs(rdkit_mol, 1, params)
    # Load into OpenFF
    openff_mol = Molecule.from_rdkit(rdkit_mol, allow_undefined_stereo=True)
except Exception:
    # Fall back to letting OpenFF parse the SMILES and build the conformer
    openff_mol = Molecule.from_smiles(smiles, allow_undefined_stereo=True)
    openff_mol.generate_conformers(n_conformers=1)

# Use the method selected above instead of a hard-coded string
openff_mol.assign_partial_charges(partial_charge_method=charge_method)
print(openff_mol.partial_charges)

However, I've encountered a couple of issues:

1. Ligands with Radicals (?):

  • Example SMILES: [S]12[Fe]3[S]4[Fe]1[S]1[Fe]2[S]3[Fe]41
  • Ligand ID: SF4
  • Error:
[15:20:31] unrecognized bond type
[15:20:31] unrecognized bond type
[15:22:38] UFFTYPER: Unrecognized charge state for atom: 0
[15:22:38] UFFTYPER: Unrecognized atom type: Fe5+2 (1)
[15:22:38] UFFTYPER: Unrecognized charge state for atom: 2
[15:22:38] UFFTYPER: Unrecognized atom type: Fe5+2 (3)
[15:22:38] UFFTYPER: Unrecognized charge state for atom: 4
[15:22:38] UFFTYPER: Unrecognized atom type: Fe5+2 (5)
[15:22:38] UFFTYPER: Unrecognized charge state for atom: 6
[15:22:38] UFFTYPER: Unrecognized atom type: Fe5+2 (7)
RadicalsNotSupportedError: The OpenFF Toolkit does not currently support parsing molecules with S- and P-block radicals. Found 1 radical electrons on molecule [S]12[Fe]3[S]4[Fe]1[S]1[Fe]2[S]3[Fe]41.

Do you have any idea how to avoid that error?

2. Ligands with Atoms Lacking Parameters in Charge Calculation Methods:

List of atoms in my dataset that do not work:

bad_atoms = ['Se','Fe','Ca','W','V','Co','As','Be','Mo','Te','Sb','Hg','Cr','Cu','Pd','Tb','Pr','Pb','Ir','Rh','Pt','Sn','Ni','Au','Zn','U','Ru']
  • Example SMILES: CCC1=C(C)C2=Cc3c(CC)c(C)c4n3[Rh@SP3]35n6c(c(C)c(CCC(=O)O)c6=CC6=[N+]3C(=C4)C(C)=C6CCC(=O)O)=CC1=[N+]25
  • Ligand ID: SF4
  • Error for Gasteiger:
[15:29:24] UFFTYPER: Unrecognized atom type: Rh3 (15)
  • Error for am1-bcc:
ValueError: No registered toolkits can provide the capability "assign_partial_charges" for args "()" and kwargs "{'molecule': Molecule with name '' and SMILES '[H][O][C](=[O])[C]([H])([H])[C]([H])([H])[C]1=[C]([C]([H])([H])[H])[C]2=[C]([H])[C]3=[N+]4[C](=[C]([H])[C]5=[C]([C]([H])([H])[C]([H])([H])[H])[C]([C]([H])([H])[H])=[C]6[C]([H])=[C]7[C]([C]([H])([H])[H])=[C]([C]([H])([H])[C]([H])([H])[C](=[O])[O][H])[C]8=[N+]7[Rh@@]4([N]2[C]1=[C]8[H])[N]65)[C]([C]([H])([H])[H])=[C]3[C]([H])([H])[C]([H])([H])[H]', 'partial_charge_method': 'am1bcc', 'use_conformers': None, 'strict_n_conformers': False, 'normalize_partial_charges': True, '_cls': <class 'openff.toolkit.topology.molecule.Molecule'>}"
Available toolkits are: [ToolkitWrapper around The RDKit version 2024.03.4, ToolkitWrapper around AmberTools version 22.0, ToolkitWrapper around Built-in Toolkit version None]
 ToolkitWrapper around The RDKit version 2024.03.4 <class 'openff.toolkit.utils.exceptions.ChargeMethodUnavailableError'> : partial_charge_method 'am1bcc' is not available from RDKitToolkitWrapper. Available charge methods are {'mmff94': {}, 'gasteiger': {}}
 ToolkitWrapper around AmberTools version 22.0 <class 'openff.toolkit.utils.exceptions.ConformerGenerationError'> : RDKit conformer generation failed.
 ToolkitWrapper around Built-in Toolkit version None <class 'openff.toolkit.utils.exceptions.ChargeMethodUnavailableError'> : Partial charge method "am1bcc"" is not supported by the Built-in toolkit. Available charge methods are {'zeros': {'rec_confs': 0, 'min_confs': 0, 'max_confs': 0}, 'formal_charge': {'rec_confs': 0, 'min_confs': 0, 'max_confs': 0}}
 
  • Output:

array([nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
nan]) <Unit('elementary_charge')>


Is there any way to obtain charges for ligands containing these atoms?

Maciek

Hi @maciejwisniewski-drugdiscovery,

Unfortunately, we don't currently support loading/handling transition metals (which I believe is the issue causing the "radicals" error with the iron-containing compound in part 1), and even if we did, the AM1BCC charge method does not support the elements in your bad_atoms list from item 2. Currently OpenFF focuses on supporting druglike organic molecules, and while we have plans to expand our domain of applicability in the coming years, it will be a few years before we can generally handle transition metals.

Cheers,
Jeff
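
Given the answer above, one practical workaround is to screen the dataset before calling the toolkit, so that ligands containing unsupported elements are set aside up front instead of failing (or silently returning NaN charges) later. The following is only a minimal sketch using the standard library. The names `ASSUMED_SUPPORTED`, `elements_in_smiles`, and `unsupported_elements` are hypothetical, and the element set is an assumption chosen for illustration, not an official OpenFF or AM1BCC list. The SMILES scan is a regex heuristic rather than a full parser: it relies on the SMILES rule that only the organic subset (B, C, N, O, P, S, F, Cl, Br, I) may be written outside brackets.

```python
import re

# Assumption for illustration only: elements typical of druglike organic
# molecules. This is NOT an official OpenFF or AM1BCC support list.
ASSUMED_SUPPORTED = frozenset({"H", "C", "N", "O", "F", "P", "S", "Cl", "Br", "I"})

# Element symbol at the start of a bracket atom, after an optional isotope.
_BRACKET_SYMBOL = re.compile(r"\[\d*([A-Z][a-z]?|se|as|te|[bcnops])")
# Any complete bracket atom, used to blank brackets out before the second pass.
_ANY_BRACKET = re.compile(r"\[[^\]]*\]")
# Atoms allowed outside brackets (organic subset, including aromatic forms).
_BARE_SYMBOL = re.compile(r"Cl|Br|[BCNOPSFI]|[bcnops]")


def elements_in_smiles(smiles):
    """Return the element symbols of the atoms written in a SMILES string.

    Hydrogen counts inside brackets (e.g. the H in [AsH]) are not reported.
    """
    found = {m.group(1).capitalize() for m in _BRACKET_SYMBOL.finditer(smiles)}
    bare = _ANY_BRACKET.sub("", smiles)
    found.update(sym.capitalize() for sym in _BARE_SYMBOL.findall(bare))
    return found


def unsupported_elements(smiles, supported=ASSUMED_SUPPORTED):
    """Return the elements in `smiles` that are outside the assumed set."""
    return elements_in_smiles(smiles) - supported


if __name__ == "__main__":
    examples = [
        "CCO",
        "C[AsH]C",
        "[S]12[Fe]3[S]4[Fe]1[S]1[Fe]2[S]3[Fe]41",
    ]
    for smi in examples:
        print(smi, sorted(unsupported_elements(smi)))
```

Ligands for which `unsupported_elements` returns a non-empty set can be logged and skipped. The rest can go through `assign_partial_charges` as in the original script. Adjust the assumed element set to match whatever the chosen charge method actually supports.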