matteoferla/molecular_rectifier

The 4n+2 rule

There are too many ad hoc cases meant to catch generic errors, when most of those errors actually stem from violations of the 4n+2 rule for aromaticity.

Simplistically, the following would detect whether a set of bonds composing a ring and its substituents is aromatic:

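from typing import Sequence
from rdkit import Chem

def obeys_4n2(mol: Chem.Mol, bonds: Sequence[Chem.Bond]) -> bool:
    # 4n+2 rule, approximated by counting aromatic and double bonds (`mol` is currently unused)
    is_aromatic_bond = lambda bond: bond.GetBondType() in (Chem.BondType.AROMATIC, Chem.BondType.DOUBLE)
    pis = sum(1 for bond in bonds if is_aromatic_bond(bond))
    return (pis - 2) % 4 == 0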

Double bonds are counted as well so that double bonds to substituents are included, as in xanthine and friends.
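
The check above counts bonds, whereas the rule is usually stated in terms of π electrons. As a minimal sketch of just that arithmetic (the helper name `is_huckel_count` is hypothetical and not part of the module), assuming the π-electron count has already been worked out by some other means:

```python
def is_huckel_count(pi_electrons: int) -> bool:
    """Return True if pi_electrons equals 4n + 2 for some integer n >= 0."""
    # 2, 6, 10, 14, ... pass; 0, 4, 8, 12, ... and odd counts fail
    return pi_electrons >= 2 and (pi_electrons - 2) % 4 == 0


# benzene has 6 pi electrons, cyclobutadiene has 4
benzene_ok = is_huckel_count(6)
cyclobutadiene_ok = is_huckel_count(4)
```

How the electron count is obtained (lone pairs, charges, exocyclic double bonds) is the hard part, and it is not addressed by this sketch.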
Expanding the list of bonds to include the substituents is easy:

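from typing import List, Set
from rdkit import Chem

# `mol` and `bond_idxs` (the indices of the ring's bonds) are assumed to be defined already
bonds: List[Chem.Bond] = list(map(mol.GetBondWithIdx, bond_idxs))
exbond_idxs: Set[int] = set()
for bond in bonds:
    # add every bond touching either end of a ring bond: the ring bonds plus the substituent bonds
    for atom in (bond.GetBeginAtom(), bond.GetEndAtom()):
        exbond_idxs.update(map(Chem.Bond.GetIdx, atom.GetBonds()))
exbonds: List[Chem.Bond] = list(map(mol.GetBondWithIdx, exbond_idxs))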
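
One possible source for the ring bond indices is RDKit's ring perception. The sketch below is an illustration under that assumption, not the module's code: it wraps the expansion in a hypothetical helper, `expand_ring_bonds`, and applies it to toluene.

```python
from typing import Iterable, List, Set

from rdkit import Chem


def expand_ring_bonds(mol: Chem.Mol, bond_idxs: Iterable[int]) -> List[Chem.Bond]:
    """Return the given bonds plus every bond touching one of their atoms."""
    exbond_idxs: Set[int] = set()
    for idx in bond_idxs:
        bond = mol.GetBondWithIdx(idx)
        for atom in (bond.GetBeginAtom(), bond.GetEndAtom()):
            exbond_idxs.update(b.GetIdx() for b in atom.GetBonds())
    # sorted only to make the output order reproducible
    return [mol.GetBondWithIdx(i) for i in sorted(exbond_idxs)]


mol = Chem.MolFromSmiles('Cc1ccccc1')      # toluene, hydrogens implicit
ring = mol.GetRingInfo().BondRings()[0]    # bond indices of the only ring
exbonds = expand_ring_bonds(mol, ring)
```

Here `ring` holds the six ring bonds and `exbonds` additionally holds the bond to the methyl group. Hydrogens are implicit in this molecule, so no C–H bonds are picked up; with explicit hydrogens they would be.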

However, this does not cover other rings: the bond list should really include every bond in the conjugated system (think cyanine dyes), and that is too complicated.
Also, charged or protonated atoms are not considered (e.g. indole). This is currently handled in a rather ad hoc manner.
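
For the first point, a rough sketch of collecting the whole conjugated system is to grow the ring's bond set outwards through bonds that RDKit flags as conjugated. The function name `conjugated_system` is hypothetical, and the approach assumes the molecule is sanitized so that the conjugation flags are set. It does nothing about lone pairs or charges.

```python
from typing import Iterable, Set

from rdkit import Chem


def conjugated_system(mol: Chem.Mol, seed_bond_idxs: Iterable[int]) -> Set[int]:
    """Grow a set of bond indices through neighbouring bonds flagged as conjugated."""
    seen: Set[int] = set(seed_bond_idxs)
    queue = list(seen)
    while queue:
        bond = mol.GetBondWithIdx(queue.pop())
        for atom in (bond.GetBeginAtom(), bond.GetEndAtom()):
            for neighbour in atom.GetBonds():
                idx = neighbour.GetIdx()
                if neighbour.GetIsConjugated() and idx not in seen:
                    seen.add(idx)
                    queue.append(idx)
    return seen


naphthalene = Chem.MolFromSmiles('c1ccc2ccccc2c1')
first_ring = naphthalene.GetRingInfo().BondRings()[0]
system = conjugated_system(naphthalene, first_ring)

toluene = Chem.MolFromSmiles('Cc1ccccc1')
toluene_system = conjugated_system(toluene, toluene.GetRingInfo().BondRings()[0])
```

Seeding with one ring of naphthalene reaches the bonds of both rings, while for toluene the methyl bond is left out because it is not conjugated. The resulting set could then be fed to a 4n+2 check, though a bond count over a large conjugated system is an even cruder proxy for the π-electron count.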

Ideally, a perfect version of this module would take all of this into account. For now, this will not be done.