/scikit-mol

scikit-learn classes for molecular vectorization using RDKit

Primary LanguagePythonApache License 2.0Apache-2.0

scikit-mol

scikit-learn classes for molecular vectorization using RDKit

TODO: Expand number of fingerprint classes and featurizers AtomPairs TopologicalTorsions RDKit Descriptors LINGOS ...

Make dictionary based FP class
    No Hashing, .fit() learns the keys of the dataset

Make a basic standardarizer transformer class

Make a SMILES to Mol transformer class

Make Notebook with examples Standalone usage Inclusion in pipeline Can transformers be used in parallel (e.g. to use both FP features and Descriptors at the same time?) Hyperparameter optimization via native Scikit-Classes Hyperparameter optimization via external optimizer e.g. https://scikit-optimize.github.io/stable/

Make basic unit-tests

Installation git clone git@github.com:EBjerrum/scikit-mol.git pip install -e .

Contributers: Esben Bjerrum, esben@cheminformania.com Son Ha, sonha@uni-mainz.de Please add yourself here, we'll properly markdown it later