Utilities using RDKit.
Language: Python
Start: 2021
Here I have collected a series of utilities using RDKit (open-source cheminformatics software; version originally used: Q3 2017):
- calcCentroidFingerprint: calculate the centroid of a set of fingerprints
- calcDescriptors: calculate some basic molecular descriptors
- drawMolecule: draw a molecule (optionally with atom indices)
- exportExcelMolImage: export a Pandas dataframe with molecule image to Excel
- printAtomBond: print atoms and bonds of a molecule
- printChiral: print chirality information of a molecule
- printPeriodicTable: print periodic table information
- sanitizeMol: sanitize a SMILES molecule
- sanitizeMolFile: sanitize a molecular file
- testBulkTanimotoSimilarity: test of the BulkTanimotoSimilarity function on different Pandas datasets of molecules
- testAddMoleculeColumnToFrame: test of the AddMoleculeColumnToFrame function, which adds a Molecule column to a Pandas dataframe (molecules are also displayed using Draw.MolsToGridImage)
- testMaxMinPicker: test of the diversity and similarity picker functions (e.g., HierarchicalClusterPicker)
- testSmilesMolSupplier: test of the SmilesMolSupplier function
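from rdkit import Chem
from rdkit.Chem import rdMolDescriptors
mySMILES = ['CCCCF', 'C1CCCCC1', 'C(=O)CN']
mols = [Chem.MolFromSmiles(SMILES) for SMILES in mySMILES]
# Morgan fingerprints (radius 3, 512 bits), one bit vector per molecule
fps = [rdMolDescriptors.GetMorganFingerprintAsBitVect(m, 3, 512) for m in mols]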
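The snippet above only builds the fingerprints. One possible way to turn them into a centroid is sketched here, assuming that "centroid" means the per-bit mean over all fingerprints; the actual definition used by calcCentroidFingerprint may differ, and the helper name `centroid_fingerprint` is hypothetical.

```python
import numpy as np
from rdkit import Chem, DataStructs
from rdkit.Chem import rdMolDescriptors


def centroid_fingerprint(fps):
    """Per-bit mean of a list of bit-vector fingerprints (hypothetical helper).

    Assumption: the centroid is the fraction of fingerprints that set each bit.
    """
    arrays = []
    for fp in fps:
        arr = np.zeros((fp.GetNumBits(),), dtype=np.uint8)
        # Copy the bit vector into the NumPy array
        DataStructs.ConvertToNumpyArray(fp, arr)
        arrays.append(arr)
    return np.mean(arrays, axis=0)


smiles_list = ['CCCCF', 'C1CCCCC1', 'C(=O)CN']
mols = [Chem.MolFromSmiles(s) for s in smiles_list]
fps = [rdMolDescriptors.GetMorganFingerprintAsBitVect(m, 3, 512) for m in mols]

centroid = centroid_fingerprint(fps)
print(centroid.shape)  # one value per fingerprint bit
```

Each entry of `centroid` lies between 0 and 1, so it can be thresholded (for example at 0.5) if a binary centroid is needed.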
from rdkit.Chem import rdMolDescriptors
import rdkit.Chem.Descriptors as Descriptors
# ExactMolWt is the monoisotopic mass; Descriptors.MolWt(mol) gives the average molecular weight
MW = Descriptors.ExactMolWt(mol)
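As an illustration of what a set of "basic molecular descriptors" could look like, the sketch below collects a few common ones into a dictionary. The selection and the helper name `basic_descriptors` are assumptions, not necessarily what calcDescriptors computes.

```python
from rdkit import Chem
from rdkit.Chem import Descriptors


def basic_descriptors(mol):
    """Return a few common descriptors for one molecule (illustrative selection)."""
    return {
        'MolWt': Descriptors.MolWt(mol),            # average molecular weight
        'ExactMolWt': Descriptors.ExactMolWt(mol),  # monoisotopic mass
        'LogP': Descriptors.MolLogP(mol),           # Crippen logP estimate
        'TPSA': Descriptors.TPSA(mol),              # topological polar surface area
        'HBD': Descriptors.NumHDonors(mol),         # hydrogen-bond donors
        'HBA': Descriptors.NumHAcceptors(mol),      # hydrogen-bond acceptors
    }


mol = Chem.MolFromSmiles('Nc1ccccc1')  # aniline
desc = basic_descriptors(mol)
for name, value in desc.items():
    print(name, value)
```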
from rdkit.Chem import Draw
Draw.MolToFile(mol, IMAGE_FILE_NAME, size=(1024, 768), fitImage=True)
Draw.MolsToGridImage(mols, molsPerRow=3, subImgSize=(400, 400))
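The calls above do not show how the atom indices get into the picture. A commonly used recipe, sketched here under the assumption that drawMolecule does something similar, is to store each atom's index in its `molAtomMapNumber` property before drawing. The helper names are hypothetical; recent RDKit releases also provide drawing options for atom indices.

```python
from rdkit import Chem
from rdkit.Chem import Draw


def label_atom_indices(mol):
    """Return a copy of mol whose atoms carry their index as atom-map number."""
    labelled = Chem.Mol(mol)  # copy, so the input molecule is left unchanged
    for atom in labelled.GetAtoms():
        atom.SetProp('molAtomMapNumber', str(atom.GetIdx()))
    return labelled


def draw_with_indices(mol, file_name, size=(400, 400)):
    """Write an image of mol with atom indices to file_name (hypothetical helper)."""
    Draw.MolToFile(label_atom_indices(mol), file_name, size=size)


mol = Chem.MolFromSmiles('Nc1ccccc1')
labelled = label_atom_indices(mol)
print([atom.GetProp('molAtomMapNumber') for atom in labelled.GetAtoms()])
```

Calling `draw_with_indices(mol, 'aniline.png')` would then write the labelled depiction to disk.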
# Walk over every bond and look up the two atoms it connects
for bond in mol.GetBonds():
    a1 = bond.GetBeginAtomIdx()
    a2 = bond.GetEndAtomIdx()
    atom1 = mol.GetAtomWithIdx(a1)
    atom2 = mol.GetAtomWithIdx(a2)
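The loop above stops once both atoms are retrieved. A minimal sketch of how the atoms and bonds could then be reported follows; the output format and the helper name `describe_atoms_and_bonds` are assumptions, not the format used by printAtomBond.

```python
from rdkit import Chem


def describe_atoms_and_bonds(mol):
    """Return one text line per atom and per bond (illustrative format)."""
    lines = []
    for atom in mol.GetAtoms():
        lines.append('atom %d: %s' % (atom.GetIdx(), atom.GetSymbol()))
    for bond in mol.GetBonds():
        atom1 = mol.GetAtomWithIdx(bond.GetBeginAtomIdx())
        atom2 = mol.GetAtomWithIdx(bond.GetEndAtomIdx())
        lines.append('bond %d: %s%d-%s%d (%s)' % (
            bond.GetIdx(),
            atom1.GetSymbol(), atom1.GetIdx(),
            atom2.GetSymbol(), atom2.GetIdx(),
            bond.GetBondType()))
    return lines


mol = Chem.MolFromSmiles('CCO')  # ethanol
lines = describe_atoms_and_bonds(mol)
print('\n'.join(lines))
```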
mol = Chem.MolFromSmiles(SMILES, sanitize=True)
Chem.MolToSmiles(mol)     # canonical SMILES from a Mol object
Chem.CanonSmiles(SMILES)  # canonical SMILES directly from a SMILES string
mol1 = Chem.MolFromSmiles('Nc1ccccc1')
mol2 = Chem.MolFromSmiles('c1c(N)cccc1')
print(Chem.MolToSmiles(mol1) == Chem.MolToSmiles(mol2))
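The comparison above works because both strings parse successfully. `Chem.MolFromSmiles` returns `None` when parsing or sanitization fails, so a sanitizing helper usually has to check for that case. The following is a minimal sketch; the name `sanitize_smiles` and the choice to return `None` for bad input are assumptions, not necessarily how sanitizeMol behaves.

```python
from rdkit import Chem


def sanitize_smiles(smiles):
    """Return the canonical SMILES for smiles, or None if it cannot be sanitized."""
    mol = Chem.MolFromSmiles(smiles, sanitize=True)
    if mol is None:
        # Parsing or sanitization failed (RDKit logs the reason to stderr)
        return None
    return Chem.MolToSmiles(mol)


same = sanitize_smiles('Nc1ccccc1') == sanitize_smiles('c1c(N)cccc1')
unclosed_ring = sanitize_smiles('C1CC')              # ring bond never closed
bad_valence = sanitize_smiles('C(C)(C)(C)(C)C')      # five bonds on one carbon
print(same, unclosed_ring, bad_valence)
```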
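from rdkit.Chem import PandasTools
# Build Mol objects from the SMILES column; they are rendered as images in the spreadsheet
df['MolImage'] = [Chem.MolFromSmiles(s) for s in df['SMILES']]
PandasTools.SaveXlsxFromFrame(df, 'export.xlsx', molCol='MolImage', size=(100, 200))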
IMPORTANT: XlsxWriter needs to be installed (pip install XlsxWriter)
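The similarity and diversity tests in the list at the top (testBulkTanimotoSimilarity, testMaxMinPicker) have no snippet here. The sketch below shows the two underlying RDKit calls on a small, made-up set of molecules; the molecule list, the fingerprint settings, and the `distance` helper are illustrative assumptions, and the Pandas handling of the original tests is left out.

```python
from rdkit import Chem, DataStructs
from rdkit.Chem import rdMolDescriptors
from rdkit.SimDivFilters.rdSimDivPickers import MaxMinPicker

# Illustrative molecules, not taken from the original test data
smiles_list = ['CCCCF', 'CCCCCl', 'C1CCCCC1', 'C(=O)CN', 'Nc1ccccc1', 'Oc1ccccc1']
mols = [Chem.MolFromSmiles(s) for s in smiles_list]
fps = [rdMolDescriptors.GetMorganFingerprintAsBitVect(m, 2, 512) for m in mols]

# Tanimoto similarity of the first molecule to every molecule (itself included)
sims = DataStructs.BulkTanimotoSimilarity(fps[0], fps)
for smi, sim in zip(smiles_list, sims):
    print('%-12s %.3f' % (smi, sim))


def distance(i, j, fps=fps):
    """Tanimoto distance between fingerprints i and j, as required by LazyPick."""
    return 1.0 - DataStructs.TanimotoSimilarity(fps[i], fps[j])


# Pick 3 mutually diverse molecules with the MaxMin algorithm
picker = MaxMinPicker()
picked = list(picker.LazyPick(distance, len(fps), 3, seed=42))
print([smiles_list[i] for i in picked])
```

`BulkTanimotoSimilarity` compares one query fingerprint against a whole list in a single call, which is why it suits dataframe-sized comparisons better than a Python loop over `TanimotoSimilarity`.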