The 4n+2 rule
There are too many ad hoc cases that are meant to catch generic errors but that mainly arise from violations of the 4n+2 rule for aromaticity.
From a simplistic point of view, the following would detect whether a set of bonds composing a ring and its substituents is aromatic:
from typing import Sequence
from rdkit import Chem

def obeys_4n2(mol: Chem.Mol, bonds: Sequence[Chem.Bond]) -> bool:
    # 4n+2 rule, applied to the number of aromatic and double bonds
    pi_bond_types = (Chem.BondType.AROMATIC, Chem.BondType.DOUBLE)
    pis = sum(bond.GetBondType() in pi_bond_types for bond in bonds)
    return (pis - 2) % 4 == 0
Double bonds are counted alongside aromatic bonds so that substituents are covered, as in xanthine and friends.
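For reference, the rule itself is stated in terms of π electrons, whereas `obeys_4n2` counts bonds as a stand-in. A minimal sketch of the arithmetic on electron counts is below. `satisfies_huckel` is a hypothetical helper, not part of the module, and the counts are textbook values entered by hand.

```python
def satisfies_huckel(pi_electrons: int) -> bool:
    """Return True if the count equals 4n + 2 for some integer n >= 0."""
    return pi_electrons >= 2 and (pi_electrons - 2) % 4 == 0


# Hand-entered pi-electron counts (assumed, not computed from a structure).
examples = {
    "benzene": 6,
    "cyclobutadiene": 4,
    "cyclooctatetraene": 8,
    "naphthalene": 10,
}

for name, count in examples.items():
    print(f"{name}: {satisfies_huckel(count)}")
```

Note that the bond count only matches the electron count when the ring is in its aromatic representation: a kekulized benzene has three double bonds, so `obeys_4n2` would see 3 and reject it.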
Expanding the list of bonds to include the substituents is easy:
bonds: List[Chem.Bond] = list(map(mol.GetBondWithIdx, bond_idxs))
exbond_idxs: Set[int] = set()
for bond in bonds:
    for atom in (bond.GetBeginAtom(), bond.GetEndAtom()):
        exbond_idxs.update(map(Chem.Bond.GetIdx, atom.GetBonds()))
exbonds: List[Chem.Bond] = list(map(mol.GetBondWithIdx, exbond_idxs))
However, this does not cover other rings, so the bond list should really include every bond in the conjugated system (think of cyanine dyes). That is too complicated.
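To give an idea of what collecting the whole conjugated system would involve, here is a rough sketch of a breadth-first walk over conjugated bonds that share an atom. It works on plain tuples rather than RDKit objects. The `BondRecord` layout and the function name are made up for illustration. With RDKit, the flag would presumably come from `Bond.GetIsConjugated()`.

```python
from collections import deque
from typing import Dict, Iterable, List, Set, Tuple

# Hypothetical record: (bond index, begin atom index, end atom index, is_conjugated)
BondRecord = Tuple[int, int, int, bool]


def conjugated_system(bonds: Iterable[BondRecord], seed_idxs: Iterable[int]) -> Set[int]:
    """Collect the indices of all conjugated bonds reachable from the seed bonds."""
    by_idx: Dict[int, BondRecord] = {record[0]: record for record in bonds}
    # Map each atom to the conjugated bonds it takes part in.
    by_atom: Dict[int, List[int]] = {}
    for idx, begin, end, conjugated in by_idx.values():
        if conjugated:
            by_atom.setdefault(begin, []).append(idx)
            by_atom.setdefault(end, []).append(idx)
    seen: Set[int] = set()
    # Non-conjugated seeds are dropped: they do not belong to any pi system.
    queue = deque(idx for idx in seed_idxs if by_idx[idx][3])
    while queue:
        idx = queue.popleft()
        if idx in seen:
            continue
        seen.add(idx)
        _, begin, end, _ = by_idx[idx]
        for atom in (begin, end):
            queue.extend(n for n in by_atom.get(atom, []) if n not in seen)
    return seen


# Toy molecule: two three-bond conjugated runs joined by one non-conjugated bond.
toy = [
    (0, 0, 1, True), (1, 1, 2, True), (2, 2, 3, True),
    (3, 3, 4, False),
    (4, 4, 5, True), (5, 5, 6, True), (6, 6, 7, True),
]
print(sorted(conjugated_system(toy, [0])))
```

Even with the bonds collected this way, it remains open how to count the electrons for a system spanning several rings, which is part of why this is not being pursued here.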
In addition, charged and protonated atoms are not taken into account (e.g. indole). These are currently handled in a rather ad hoc manner.
Ideally, were this module to be perfect, all of this would need to be considered. For now it will not be done.
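As a note for whenever this gets revisited, the bookkeeping for charged and protonated atoms could look something like the sketch below. Each ring atom contributes 0, 1 or 2 π electrons depending on its type. The class names and the `PI_CONTRIBUTION` table are hypothetical and assigned by hand. Nothing here is derived from an RDKit molecule.

```python
from typing import Dict, Iterable

# Hypothetical atom classes with their textbook pi-electron contributions.
PI_CONTRIBUTION: Dict[str, int] = {
    "sp2_carbon": 1,   # carbon in a ring double or aromatic bond
    "pyridine_n": 1,   # nitrogen whose lone pair lies in the ring plane
    "pyrrole_n": 2,    # N-H or N-R nitrogen donating its lone pair
    "furan_o": 2,      # ring oxygen donating one lone pair
    "thiophene_s": 2,  # ring sulfur donating one lone pair
    "carbanion": 2,    # negatively charged ring carbon
    "carbocation": 0,  # positively charged ring carbon, empty p orbital
}


def count_pi_electrons(atom_classes: Iterable[str]) -> int:
    """Sum the pi-electron contributions of the given ring atoms."""
    return sum(PI_CONTRIBUTION[atom_class] for atom_class in atom_classes)


def is_4n_plus_2(pi_electrons: int) -> bool:
    """Return True if the count equals 4n + 2 for some integer n >= 0."""
    return pi_electrons >= 2 and (pi_electrons - 2) % 4 == 0


# Ring systems described by hand as lists of atom classes.
indole = ["sp2_carbon"] * 8 + ["pyrrole_n"]
pyridine = ["sp2_carbon"] * 5 + ["pyridine_n"]
tropylium = ["sp2_carbon"] * 6 + ["carbocation"]
cyclopentadienyl_cation = ["sp2_carbon"] * 4 + ["carbocation"]

for name, atoms in [
    ("indole", indole),
    ("pyridine", pyridine),
    ("tropylium", tropylium),
    ("cyclopentadienyl cation", cyclopentadienyl_cation),
]:
    electrons = count_pi_electrons(atoms)
    print(f"{name}: {electrons} pi electrons, 4n+2: {is_4n_plus_2(electrons)}")
```

The hard part, which this sketch sidesteps, is assigning the atom classes automatically from the molecule.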