molecularinformatics/roshambo

Info about benchmark runs (CXCR4 and CSF1R)

Closed this issue · 4 comments

Hi, would it be possible to share input files and details used in the benchmark runs? Thanks :)

Here is the code snippet used for the runs:

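from roshambo.api import get_similarity_scores

# dud_folders is defined elsewhere: one directory per target, each with a ligands/ subfolder
for folder in dud_folders:
    print("{} ligands starting...".format(folder))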
    get_similarity_scores(
        ref_file="query.sdf",
        dataset_files_pattern="ligands.smi",
        ignore_hs=True,
        n_confs=60,
        keep_mol=True,
        random_seed=109838974,
        opt_confs=True,
        calc_energy=True,
        energy_iters=300,
        energy_cutoff=30,
        align_confs=True,
        rms_cutoff=0.1,
        num_threads=46,
        method="ETKDGv3",
        volume_type="analytic",
        n=2,
        epsilon=0.5,
        use_carbon_radii=True,
        color=True,
        max_conformers=1,
        sort_by="ComboTanimoto",
        write_to_file=True,
        gpu_id=0,
        working_dir="{}/ligands".format(folder),
        # smiles_kwargs={"delimiter": "\t"},  # uncomment for tab-delimited .smi files
    )

A few notes:

  • We downloaded the SMILES files (ligands.smi and decoys.smi for each target) from the Charged_Matched_DUDE folder at https://dudez.docking.org/
  • The above loop covers only the ligands. For the decoys, run it again with the working directory and dataset file changed accordingly
  • The last (commented-out) option is important! Some of the .smi files are space-delimited, and the above command works for those as is. If the .smi file is tab-delimited, you will need to uncomment the last option to tell the file parser that.
  • If there are macrocycles in the underlying .smi file (as for TRY1 and XIAP), you will need to remove those manually
  • If you run out of memory on your GPU, you will need to manually batch the inputs, or decrease the number of conformers. We're working on making ROSHAMBO faster and more memory efficient now!
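
The delimiter check and the manual batching mentioned in the notes above could be scripted roughly as follows. This is only a sketch, not part of ROSHAMBO: the helper names `detect_smi_delimiter` and `split_smi_into_batches` are hypothetical, the batch file naming is arbitrary, and it assumes the .smi files have no header line.

```python
from itertools import islice
from pathlib import Path


def detect_smi_delimiter(path, sample_lines=20):
    """Return a tab if any of the first lines contains one, else a single space.

    Hypothetical helper: it only inspects the first `sample_lines` lines.
    """
    with open(path, "r", encoding="utf-8") as handle:
        sample = [line for line in islice(handle, sample_lines) if line.strip()]
    return "\t" if any("\t" in line for line in sample) else " "


def split_smi_into_batches(path, batch_size, out_dir):
    """Write consecutive chunks of at most `batch_size` molecules to new .smi files.

    Assumes one molecule per non-empty line and no header line.
    Returns the list of batch file paths in order.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    with open(path, "r", encoding="utf-8") as handle:
        lines = [line for line in handle if line.strip()]

    batch_paths = []
    for start in range(0, len(lines), batch_size):
        index = start // batch_size
        batch_path = out_dir / "{}_batch{:03d}.smi".format(Path(path).stem, index)
        batch_path.write_text("".join(lines[start:start + batch_size]), encoding="utf-8")
        batch_paths.append(batch_path)
    return batch_paths
```

With these, `smiles_kwargs` could be set to `{"delimiter": "\t"}` only when `detect_smi_delimiter` returns a tab, and each batch file name passed as `dataset_files_pattern` in turn. Whether each batch needs its own working directory to avoid overwriting output files is something to check against your ROSHAMBO version.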

Thanks! I'll give it a spin 🐡

A quick follow-up question: is query.sdf a compound from the ligands.smi file?

Good point - for each target, it is the xtal-lig.pdb file (converted into an .sdf file) found in the DOCKING_GRIDS_AND_POSES folder of dudez.docking.org
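
The thread does not say which tool was used for the PDB to SDF conversion. One possible way to do it is with RDKit, sketched below. The function names `query_sdf_path` and `pdb_ligand_to_sdf` are hypothetical, and the optional `template_smiles` argument is an assumption: PDB files carry no bond orders, so a SMILES of the same ligand can be supplied to restore them.

```python
from pathlib import Path


def query_sdf_path(pdb_path, name="query.sdf"):
    """Return a path for the converted query next to the source PDB file."""
    return Path(pdb_path).with_name(name)


def pdb_ligand_to_sdf(pdb_path, sdf_path=None, template_smiles=None):
    """Convert a single-ligand PDB file to SDF with RDKit (illustrative only)."""
    # Imported here so the path helper above can be used without RDKit installed.
    from rdkit import Chem
    from rdkit.Chem import AllChem

    # Hydrogens are dropped on read (the RDKit default).
    mol = Chem.MolFromPDBFile(str(pdb_path))
    if mol is None:
        raise ValueError("RDKit could not parse {}".format(pdb_path))

    if template_smiles is not None:
        # Copy bond orders from a SMILES of the same ligand onto the PDB coordinates.
        template = Chem.MolFromSmiles(template_smiles)
        mol = AllChem.AssignBondOrdersFromTemplate(template, mol)

    sdf_path = Path(sdf_path) if sdf_path is not None else query_sdf_path(pdb_path)
    writer = Chem.SDWriter(str(sdf_path))
    writer.write(mol)
    writer.close()
    return sdf_path
```

For example, `pdb_ligand_to_sdf("CXCR4/xtal-lig.pdb")` would write `CXCR4/query.sdf`, where the directory layout is an assumption. It is worth inspecting the resulting file, since bond orders and charges perceived from a PDB file alone are often wrong without a template.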