Bill Tatsis, Matt Seddon, Dan Mason, Dan O'Donovan, Gio Cincilla, Azedine Zoufir and Nath Brown
Lig3DLens performs the following tasks:
- Prepares a commercial compound library for a virtual screening (VS) campaign. This task involves: i) compound standardisation and ii) filtering out compounds outside a predefined range of physicochemical properties.
- Generates conformers for all of the compounds in the commercial library and calculates their 3D similarity (shape & electrostatics) to a reference compound.
- Clusters the highest-scoring hits and selects a set number of representative compounds that can be ordered and tested.
python -m pip install -r requirements.txt .
- Prepare a chemical library for a 3D VS campaign
Note: To keep track of the library compounds, the input file should have a column containing the text "ID".
lig3dlens-prepare --in input_SD_file --filter physchem_yaml_file --out output_SD_file
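The range-based filtering step can be pictured with a minimal Python sketch. It does not show Lig3DLens's own code or its YAML schema: the property names (`MW`, `LogP`, `HBD`), the ranges, the helper `passes_filter` and the three example compounds are all hypothetical, and the properties are assumed to be already computed.

```python
# Hypothetical property ranges (inclusive). The keys and limits used in a
# real physchem YAML file for Lig3DLens may differ.
PHYSCHEM_RANGES = {
    "MW": (200.0, 500.0),
    "LogP": (-1.0, 5.0),
    "HBD": (0, 5),
}


def passes_filter(props, ranges=PHYSCHEM_RANGES):
    """Return True if every listed property is present and within its range."""
    return all(
        name in props and low <= props[name] <= high
        for name, (low, high) in ranges.items()
    )


# Made-up compounds with precomputed properties, keyed by an "ID" field.
library = [
    {"ID": "cmpd-1", "MW": 320.4, "LogP": 2.1, "HBD": 2},
    {"ID": "cmpd-2", "MW": 612.7, "LogP": 4.8, "HBD": 3},  # MW above range
    {"ID": "cmpd-3", "MW": 250.3, "LogP": 6.2, "HBD": 1},  # LogP above range
]

kept_ids = [compound["ID"] for compound in library if passes_filter(compound)]
print(kept_ids)
```

Treating a missing property as a failure is one possible design choice; a more permissive filter could skip properties that were not calculated.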
- Generates 3D conformers for both the library and reference compounds and scores the library compounds using a 3D shape & electrostatics similarity function to the reference molecule
Note: To keep track of the library compounds, the input file should have a column containing the text "ID".
lig3dlens-align --ref input_reference_molecule_file --lib input_library_file_name --conf num_conformers --out output_SD_file
- Clusters the highest scoring molecules and selects a representative (diverse) set of compounds. The user can input the number of clusters (num_clusters), the fingerprint type (fingerprint_type) and its dimension (fingerprint_dimension) used for the clustering.
lig3dlens-cluster --in input_SD_file --clusters num_clusters --out output_file --dim fingerprint_dimension --fp_type fingerprint_type
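As a rough illustration of the selection idea, the sketch below picks one representative per cluster once cluster labels are known. It assumes the representative is simply the highest-scoring member of each cluster, which may not be how Lig3DLens chooses it. The function `pick_representatives`, the IDs, scores and labels are hypothetical.

```python
def pick_representatives(ids, scores, labels):
    """Return the highest-scoring compound ID of each cluster, by cluster label."""
    best = {}  # cluster label -> (score, compound ID)
    for compound_id, score, label in zip(ids, scores, labels):
        if label not in best or score > best[label][0]:
            best[label] = (score, compound_id)
    return [best[label][1] for label in sorted(best)]


# Made-up 3D similarity scores and precomputed cluster labels.
ids = ["A", "B", "C", "D", "E"]
scores = [0.91, 0.85, 0.78, 0.88, 0.60]
labels = [0, 0, 1, 2, 1]

representatives = pick_representatives(ids, scores, labels)
print(representatives)
```

With three clusters, the selection contains three compounds, one per cluster.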
Use the Makefile commands to help tidy the codebase periodically. The following will reformat the code according to PEP8, and logically sort the imported modules:
make tidy
Run pytest from the lig3dlens directory:
pytest tests
Note: The whole VS workflow was tested in a Linux (Ubuntu) environment, where the following environment variable had to be set. It tells MKL (used by NumPy) to use the GNU OpenMP runtime instead of the Intel OpenMP runtime:
export MKL_THREADING_LAYER=GNU
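When the workflow is driven from a Python script rather than a shell, the same variable can be set in-process. This is a minimal sketch, assuming the assignment runs before NumPy (or any other MKL-linked library) is imported for the first time:

```python
import os

# Set the threading layer before NumPy or any other MKL-linked library
# is imported; setting it afterwards is assumed to have no effect.
os.environ["MKL_THREADING_LAYER"] = "GNU"

# import numpy as np  # import NumPy only after the variable is set
```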
- Compound library preparation:
- Compound selection:
- Multi-parameter selection of compounds using a score function that includes the 3D score, the 2D similarity to the reference compound, and the physchem properties. The aim is to get an even balance between highly scored compounds and the other properties.
- Select an optimal number of clusters instead of a predefined one (e.g. using Silhouette or affinity propagation methods). Alternatively, use another method for the maximum score-diversity selection problem (e.g. the Score Erosion algorithm).
- Provide tools to analyse the chemical diversity of the final selected compound set.
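One possible shape for the multi-parameter score mentioned above is a weighted sum. The sketch below is only an illustration, not a planned implementation: the function `combined_score`, the weights, and the example values are hypothetical, and all three terms are assumed to be scaled to the range 0 to 1 with higher meaning better.

```python
def combined_score(score_3d, sim_2d, physchem, weights=(0.5, 0.3, 0.2)):
    """Weighted sum of the 3D score, 2D similarity and a physchem term."""
    w_3d, w_2d, w_pc = weights
    return w_3d * score_3d + w_2d * sim_2d + w_pc * physchem


# Made-up candidates: (3D score, 2D similarity, physchem desirability).
candidates = {
    "A": (0.90, 0.20, 0.50),
    "B": (0.70, 0.60, 0.90),
    "C": (0.80, 0.40, 0.30),
}

# Rank compound IDs from the highest to the lowest combined score.
ranked = sorted(
    candidates,
    key=lambda compound_id: combined_score(*candidates[compound_id]),
    reverse=True,
)
print(ranked)
```

Whether 2D similarity to the reference should be rewarded, as here, or penalised to favour novel scaffolds depends on the goal of the campaign; flipping the sign of its weight would express the latter.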