Enumerate conformational, protomeric, and pH-related microstates for docking with AutoDock

Primary language: Python. License: GNU General Public License v3.0 (GPL-3.0).

Scrubber

Process large numbers of small molecules for docking with AutoDock. May be useful for structure-based modeling in general.

What happens:

  • generate 3D coordinates using RDKit's ETKDGv3 and UFF minimization
  • enumerate tautomers (aiming at low energy states only)
  • enumerate protonation states for the requested pH range
  • convert boats to chairs (6-membered rings) and enumerate both chair states
  • enumerate chiral centers (not implemented right now)
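
The first step in the list above can be sketched with plain RDKit calls. This is a minimal illustration of ETKDGv3 embedding followed by UFF minimization, not Scrubber's actual implementation; the helper name `embed_and_minimize`, the fixed random seed, and the iteration limit are assumptions made for the example.

```python
from rdkit import Chem
from rdkit.Chem import AllChem


def embed_and_minimize(smiles, seed=42):
    """Build one 3D conformer with ETKDGv3, then relax it with UFF.

    Hypothetical helper for illustration; Scrubber's internals may differ.
    """
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError("RDKit could not parse SMILES: %s" % smiles)

    # Explicit hydrogens are needed for sensible 3D geometry.
    mol = Chem.AddHs(mol)

    params = AllChem.ETKDGv3()
    params.randomSeed = seed  # fixed seed so repeated runs give the same geometry

    # EmbedMolecule returns the new conformer id, or -1 on failure.
    if AllChem.EmbedMolecule(mol, params) == -1:
        raise RuntimeError("3D embedding failed for: %s" % smiles)

    # UFF minimization of the embedded conformer.
    AllChem.UFFOptimizeMolecule(mol, maxIters=500)
    return mol


mol3d = embed_and_minimize("CCO")
print("conformers:", mol3d.GetNumConformers())
print("is 3D:", mol3d.GetConformer().Is3D())
```

Fixing the seed is a choice for reproducibility in this sketch only; a production pipeline may prefer several embeddings per molecule.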

Installation

conda activate <desired-environment>    # if you are using conda environments

git clone git@github.com:forlilab/scrubber.git
cd scrubber
pip install -e .

Scrubber depends on RDKit, which can be installed from conda-forge in the desired environment:

conda activate <desired-environment>
conda install rdkit -c conda-forge

Python scripting

from rdkit import Chem
from scrubber import Scrub

scrub = Scrub(
    ph_low=7.4,
    ph_high=7.4,
)

mol = Chem.MolFromSmiles("Clc1c(OCCC3)c3ccc1C(=O)Nc2nc[nH]c2")

# each state (e.g. tautomer) is an RDKit mol and may have multiple conformers
for mol_state in scrub(mol):
    print(Chem.MolToSmiles(mol_state), "nr conformers: %d" % mol_state.GetNumConformers())
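
Because each state returned by the loop above is an RDKit mol that may carry several conformers, saving the results means writing one record per conformer. The sketch below shows one way to do that with RDKit's `SDWriter`. The helper `write_conformers` is hypothetical, and a stand-in molecule embedded directly with RDKit takes the place of the `scrub(mol)` output so that the example runs without Scrubber installed.

```python
import os
import tempfile

from rdkit import Chem
from rdkit.Chem import AllChem


def write_conformers(mol_states, path):
    """Write every conformer of every state to an SDF file.

    Hypothetical helper; returns the number of records written.
    """
    n_written = 0
    writer = Chem.SDWriter(path)
    try:
        for mol_state in mol_states:
            for conf in mol_state.GetConformers():
                # confId selects which conformer's coordinates are written
                writer.write(mol_state, confId=conf.GetId())
                n_written += 1
    finally:
        writer.close()
    return n_written


# Stand-in for the states yielded by scrub(mol): one molecule, a few conformers.
stand_in = Chem.AddHs(Chem.MolFromSmiles("CCO"))
AllChem.EmbedMultipleConfs(stand_in, numConfs=3, randomSeed=42)

out_path = os.path.join(tempfile.mkdtemp(), "states.sdf")
n_written = write_conformers([stand_in], out_path)
print("records written:", n_written)
```

With Scrubber available, `write_conformers(scrub(mol), "states.sdf")` would be the analogous call, assuming the states behave like ordinary RDKit mols as described above.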

Command line tool examples

scrub.py "c1cc[nH]c(=O)c1" -o scrubbed.sdf --pH 5 --skip_gen3d
scrub.py input_mols.sdf -o scrubbed.sdf
scrub.py input_mols.smi -o scrubbed.sdf

Other options are described in the help message:

scrub.py -h

The input file "input_mols.smi" holds one SMILES string and a name per line, and can look like this:

CC(=O)O aceticacid
CN(C)C trimethylamine
Clc1cc(O)ccc1C(=O)Nc2nc[nH]c2 hello_mol
c1cccc1 rdkit_will_cry
CCC good4bbq
CCO alsogood4bbq
c1cccnc1CC(=O)C a_ketone
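
As the name `rdkit_will_cry` hints, not every line in such a file needs to be a valid SMILES. The sketch below pre-checks the example lines with RDKit and separates the entries that parse from those that do not. It is an illustration only: the helper `split_smi_lines` is hypothetical, and how `scrub.py` itself treats unparsable lines is not specified here.

```python
from rdkit import Chem, RDLogger

# Silence RDKit's parse-error messages; failures are reported by the helper instead.
RDLogger.DisableLog("rdApp.error")

SMI_TEXT = """\
CC(=O)O aceticacid
CN(C)C trimethylamine
Clc1cc(O)ccc1C(=O)Nc2nc[nH]c2 hello_mol
c1cccc1 rdkit_will_cry
CCC good4bbq
CCO alsogood4bbq
c1cccnc1CC(=O)C a_ketone
"""


def split_smi_lines(text):
    """Return (valid_names, rejected_names) for 'SMILES name' lines.

    Hypothetical helper for illustration only.
    """
    valid, rejected = [], []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        smiles = fields[0]
        name = fields[1] if len(fields) > 1 else smiles
        # MolFromSmiles returns None when the SMILES cannot be parsed or sanitized
        if Chem.MolFromSmiles(smiles) is None:
            rejected.append(name)
        else:
            valid.append(name)
    return valid, rejected


valid, rejected = split_smi_lines(SMI_TEXT)
print("valid:", valid)
print("rejected:", rejected)
```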