Simple package for fast molecular similarity searches

Primary language: Python. License: MIT.

FPSim2: Simple package for fast molecular similarity searches

FPSim2 is a small, NumPy-centric, RDKit-based Python/C++ package for running fast compound similarity searches. It performs better with high search thresholds (>= 0.7). It is currently used in the ChEMBL and SureChEMBL interfaces.

Highlights:

  • Uses the CPU POPCNT instruction
  • Uses the bounds for sublinear speedups from 10.1021/ci600358f
  • A compressed file format with optimised read speed, based on PyTables and Blosc
  • Fast multicore CPU and GPU similarity searches
  • In-memory and on-disk search modes
  • Distance matrix calculation
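
The first two highlights can be illustrated with a minimal pure-Python sketch. This is not FPSim2's implementation (the package relies on NumPy and compiled C++ code); it only shows the general idea under some assumptions. Fingerprints are represented here as plain Python integers used as bit sets, and the helper names (popcount, tanimoto, build_index, search) are hypothetical. The pruning step assumes the commonly used popcount bound for Tanimoto searches: a fingerprint with B bits set can only reach a similarity of at least t against a query with A bits set if A*t <= B <= A/t. Sorting the database by popcount then lets a search skip everything outside that window, which is why higher thresholds tend to be faster.

```python
import math
from bisect import bisect_left, bisect_right


def popcount(fp: int) -> int:
    """Number of set bits (what the POPCNT instruction computes in hardware)."""
    return bin(fp).count("1")


def tanimoto(a: int, b: int) -> float:
    """Tanimoto coefficient: |a AND b| / |a OR b|."""
    union = popcount(a | b)
    return popcount(a & b) / union if union else 0.0


def build_index(fps):
    """Sort fingerprints by popcount, keeping their original positions."""
    return sorted((popcount(fp), i, fp) for i, fp in enumerate(fps))


def search(query: int, index, threshold: float):
    """Return (position, similarity) pairs with similarity >= threshold.

    Assumes 0 < threshold <= 1. Only fingerprints whose popcount lies in
    [A * t, A / t] are compared; the rest cannot reach the threshold.
    """
    a = popcount(query)
    eps = 1e-9  # guards the integer bounds against floating-point rounding
    low = math.ceil(a * threshold - eps)
    high = math.floor(a / threshold + eps)
    counts = [count for count, _, _ in index]
    start = bisect_left(counts, low)
    stop = bisect_right(counts, high)
    hits = []
    for _, position, fp in index[start:stop]:
        score = tanimoto(query, fp)
        if score >= threshold:
            hits.append((position, score))
    # Highest similarity first; ties keep index order.
    return sorted(hits, key=lambda hit: -hit[1])


if __name__ == "__main__":
    database = [0b1111, 0b1110, 0b0001, 0b11111111, 0b1111]
    index = build_index(database)
    for position, score in search(0b1111, index, 0.7):
        print(position, round(score, 2))
```

With a threshold of 0.7 and a query with 4 bits set, only fingerprints with 3 to 5 bits set are compared, so two of the five toy entries are never scored.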

Installation

pip install fpsim2

or

conda install -c conda-forge fpsim2

Documentation

Documentation is available at https://chembl.github.io/FPSim2/

Trying it online

To try out FPSim2 interactively in your web browser, click the Binder badge.