RetroBioCat

RetroBioCat is a web-based tool for designing biocatalytic cascades and reactions.

We recommend using RetroBioCat through the online version hosted at https://retrobiocat.com

However, you may run your own instance of RetroBioCat by following the installation instructions below. Currently only an example data-set is provided for the substrate specificity database.

For more information, please see our preprint:
Finnigan, William; Hepworth, Lorna J.; Turner, Nicholas J.; Flitsch, Sabine (2020): RetroBioCat: Computer-Aided Synthesis Planning for Biocatalytic Reactions and Cascades. ChemRxiv. Preprint. https://doi.org/10.26434/chemrxiv.12571235.v1

Requirements and testing

  • python = 3.7
  • rdkit >= 2020
  • tensorflow >= 2.1.0
  • Python packages listed in requirements.txt. The latest version of each package is recommended, except where a version is specified.
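
To see which versions are present in the active environment, a small check along the following lines may help. This is a minimal sketch and not part of RetroBioCat: the helper names installed_version and report are hypothetical, and it assumes each package exposes a __version__ attribute.

```python
import importlib
import sys


def installed_version(module_name):
    """Return the module's __version__ string, or None if it cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    # Assumption: the package publishes its version as __version__.
    return getattr(module, "__version__", "unknown")


def report(module_names=("rdkit", "tensorflow")):
    """Build a readable list of the interpreter and package versions found."""
    lines = ["python " + ".".join(str(part) for part in sys.version_info[:3])]
    for name in module_names:
        version = installed_version(name)
        lines.append("{} {}".format(name, version if version else "NOT INSTALLED"))
    return lines


print("\n".join(report()))
```

The output can then be compared by eye against the versions listed above.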

RetroBioCat also requires running mongodb and redis instances, for which we recommend using docker.

RetroBioCat has been tested on macOS v10.14.6 and Ubuntu 18.04.3 (LTS).

Installation should take 10-30 minutes on a modern computer.

Option 1 - Use docker-compose

Warning - currently this method does not function correctly on Windows due to an issue with specifying a volume for use with the mongo container.

  • Clone this repository and change the working directory to retrobiocat/docker/
git clone https://github.com/willfinnigan/retrobiocat.git 
cd retrobiocat/docker/
  • Build and start the docker containers
docker-compose build --no-cache
docker-compose up

RetroBioCat should now be available locally at http://127.0.0.1:5000
The databases must now be initialised (see below).
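
Before moving on, it can help to confirm that the site is actually being served, since the containers may need a little time to start. The following is a minimal sketch using only the Python standard library. The function name site_is_up and its retry parameters are hypothetical choices, not part of RetroBioCat.

```python
import time
import urllib.error
import urllib.request


def site_is_up(url="http://127.0.0.1:5000", attempts=5, delay=2.0):
    """Poll the URL and return True as soon as the server answers a request."""
    for attempt in range(attempts):
        try:
            with urllib.request.urlopen(url, timeout=5):
                return True
        except urllib.error.HTTPError:
            # The server answered, even if with an error status code.
            return True
        except OSError:
            # Connection refused or timed out: the server may still be starting.
            if attempt < attempts - 1:
                time.sleep(delay)
    return False
```

Calling site_is_up() with no arguments checks the local address given above. A False result after all attempts suggests inspecting the docker-compose output.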

Option 2 - Manual Installation

RetroBioCat requires anaconda or miniconda with python 3.7 or later
You may wish to install retrobiocat in a virtual environment to prevent conflicting dependencies.

  • First, install the following conda package
conda install -c rdkit rdkit -y
  • Next, clone this repository, either with the git command below or through the link on github
git clone https://github.com/willfinnigan/retrobiocat.git 
  • From the retrobiocat directory, install retrobiocat_web along with its requirements.
pip install -e .

2b. Running redis and mongodb

RetroBioCat requires access to a redis server and a mongo database on the default ports.
We recommend using docker to run redis and mongodb.

To run redis using docker:

docker run -d -p 6379:6379 redis

To run mongodb using docker:

docker run -d -p 27017-27019:27017-27019 mongo:4.0.4
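
Once both containers are started, a quick reachability check on the default ports can catch a container that failed to start. The sketch below only tests that a TCP connection can be opened. It does not verify that the listener is really redis or mongodb, and the names port_is_open and check_services are hypothetical.

```python
import socket

# Default ports: 6379 for redis and 27017 for mongodb, as in the docker commands above.
DEFAULT_SERVICES = {"redis": 6379, "mongodb": 27017}


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_services(host="127.0.0.1", services=DEFAULT_SERVICES):
    """Map each service name to whether its port accepts connections."""
    return {name: port_is_open(host, port) for name, port in services.items()}


for name, reachable in check_services().items():
    print("{}: {}".format(name, "reachable" if reachable else "NOT reachable"))
```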

2c. Running RetroBioCat

To run the RetroBioCat website, two python scripts are required.
From the retrobiocat directory, run (in separate terminals):

python retrobiocat_web/main.py
python retrobiocat_web/worker.py

RetroBioCat should now be available locally at http://127.0.0.1:5000
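
As an alternative to two separate terminals, both scripts could be started from a single Python launcher. This is only a sketch under stated assumptions: the helpers start_retrobiocat and stop_all are hypothetical, and it assumes the scripts need no arguments and should run with the retrobiocat directory as their working directory.

```python
import subprocess
import sys
from pathlib import Path

# Script paths as given above, relative to the retrobiocat directory.
SCRIPTS = ("retrobiocat_web/main.py", "retrobiocat_web/worker.py")


def start_retrobiocat(repo_dir, scripts=SCRIPTS):
    """Start each script as a child process, using repo_dir as the working directory."""
    repo_dir = Path(repo_dir)
    missing = [s for s in scripts if not (repo_dir / s).is_file()]
    if missing:
        raise FileNotFoundError("Missing scripts: " + ", ".join(missing))
    return [subprocess.Popen([sys.executable, s], cwd=str(repo_dir)) for s in scripts]


def stop_all(processes):
    """Terminate any child process that is still running, then wait for all to exit."""
    for process in processes:
        if process.poll() is None:
            process.terminate()
    for process in processes:
        process.wait()
```

A typical use would be processes = start_retrobiocat("."), followed by stop_all(processes) when the site is no longer needed.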

Initialise the database (required for both methods of installation)

Before your local version of RetroBioCat can be used, the databases it relies on must be set up.

To do this, first log in using the default admin account:

Navigate to the Initialise Database page in the admin menu.

Initialise the database by uploading the required files. This can be done one at a time (recommended) or all together.

Files are available at:
https://figshare.com/articles/software/RetroBioCat_database_files/12696482

Reaction rules: rxns_yaml.yaml
Activity: trial_biocatdb_will_and_lorna.xlsx
Building blocks: building_blocks.db
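
Before uploading, it may be worth confirming that all three files were downloaded. The sketch below checks a folder for the file names listed above. The function name missing_files and the idea of a single download folder are assumptions made for illustration.

```python
from pathlib import Path

# File names as listed above, keyed by the database they initialise.
REQUIRED_FILES = {
    "Reaction rules": "rxns_yaml.yaml",
    "Activity": "trial_biocatdb_will_and_lorna.xlsx",
    "Building blocks": "building_blocks.db",
}


def missing_files(download_dir, required=REQUIRED_FILES):
    """Return the labels of required files that are absent from download_dir."""
    folder = Path(download_dir)
    return [label for label, name in required.items() if not (folder / name).is_file()]
```

An empty list from missing_files("path/to/downloads") means every file is in place.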

Currently only an example set of substrate specificity information is provided, pending future publications.

Once the databases are initialised RetroBioCat is ready to use.

Automated testing of pathway test-set

Our publication on RetroBioCat features an evaluation on a test-set of 52 pathways.
We automated this evaluation using a script available in the /scripts/pathway_testing/ folder.

To run the pathway_eval.py script, install retrobiocat via option 2 (above) and ensure that your mongodb instance is running and that the databases have been initialised as described above.

Change directory to /scripts/pathway_testing/ and run python pathway_eval.py

Note that this script takes a long time to run. Results are saved by default to test_pathways.xlsx
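
Because the run is long, it may be convenient to wrap it so that the results file is checked automatically at the end. The sketch below is illustrative only. The function name run_pathway_eval is hypothetical, and it assumes that pathway_eval.py takes no arguments and writes test_pathways.xlsx into the directory it is run from.

```python
import subprocess
import sys
from pathlib import Path


def run_pathway_eval(script_dir, output_name="test_pathways.xlsx"):
    """Run pathway_eval.py inside script_dir and return the results path if it was written."""
    script_dir = Path(script_dir)
    # check=True raises CalledProcessError if the script exits with an error.
    subprocess.run([sys.executable, "pathway_eval.py"], cwd=str(script_dir), check=True)
    output = script_dir / output_name
    # Assumption: results land in the working directory; None signals they did not.
    return output if output.is_file() else None
```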

(Note: replicating the results in the paper requires the complete set of reaction rules and the complete database file, which are not yet publicly available.)