Binary string classification of SMILES with an LSTM

Primary language: Python. License: GNU General Public License v3.0 (GPL-3.0).

BinaryStringClassifier 🔣

An exploration into SMILES detection using an LSTM neural network.

Further discussion can be found in the associated blog post.

Quick Start

# Install
pip install -r requirements.txt
python3 setup.py install

# View help
BSC-Data -h
BSC-Model -h

Functionality

  • BSC-Data: Entrypoint for dataset creation and curation.
    • create: Create randomly generated datasets within given parameters.
    • combine: Combine multiple component datasets into one ready for training.
    • evaluate: Generate summary information for a given dataset.
    • split: Perform train/test split for a given dataset.
  • BSC-Model: Entrypoint for model training & evaluation.
    • train: Train a BinaryStringClassifier model.
    • evaluate: Evaluate a trained BinaryStringClassifier model.
    • predict: Get SMILES probabilities with a trained BinaryStringClassifier model.
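
To make the data steps above more concrete, here is a minimal standard-library sketch of what creating, combining, and splitting a dataset could look like. It is not the BSC-Data implementation: the function names (`create_random_strings`, `combine`, `split`), their parameters, and the labeling convention (1 for SMILES, 0 for random strings) are all assumptions made for illustration.

```python
import random
import string

# Hypothetical helpers for illustration only; not the BSC-Data code.
ALPHABET = string.ascii_letters + string.digits + string.punctuation


def create_random_strings(n, min_len, max_len, seed=0):
    """Return n random strings with lengths in [min_len, max_len]."""
    rng = random.Random(seed)
    return [
        "".join(rng.choice(ALPHABET) for _ in range(rng.randint(min_len, max_len)))
        for _ in range(n)
    ]


def combine(positives, negatives):
    """Label positives 1 and negatives 0, returning (text, label) pairs."""
    return [(s, 1) for s in positives] + [(s, 0) for s in negatives]


def split(dataset, test_fraction=0.2, seed=0):
    """Shuffle a copy of the dataset and return (train, test)."""
    rng = random.Random(seed)
    shuffled = list(dataset)
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Ethanol, benzene, acetic acid, cyclohexane, ethylamine.
smiles = ["CCO", "c1ccccc1", "CC(=O)O", "C1CCCCC1", "CCN"]
noise = create_random_strings(5, min_len=3, max_len=10)
train, test = split(combine(smiles, noise), test_fraction=0.2)
print(len(train), len(test))  # prints "8 2"
```

Seeding the generators keeps the dataset and the split reproducible between runs. Note that a random string can occasionally be a valid SMILES by chance, so a real pipeline would likely need to validate or filter the negative class.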

Example Plots

[Figures: Accuracy by Epoch, Loss by Epoch, Confusion Matrix, ROC Curve]