Open Targets Tractability Pipeline

Introduction

This pipeline produces tractability data for a list of input Ensembl Gene IDs. The implementation is based on the public version of the GSK tractability pipeline, published here.

The pipeline produces a TSV file with one target per row. Small molecule tractability buckets are denoted with "Bucket_X", antibody buckets with "Bucket_X_ab" and PROTAC buckets with "Bucket_X_PROTAC".
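The naming convention above can be applied mechanically when reading the output. The following is a minimal sketch, not part of the pipeline itself: the function name `classify_bucket_column` is hypothetical, and it assumes the `X` in each column name contains no underscore.

```python
import re

# Matches "Bucket_X", "Bucket_X_ab" and "Bucket_X_PROTAC".
# Assumption: the "X" part contains no underscore.
_BUCKET_RE = re.compile(r"^Bucket_([^_]+)(?:_(ab|PROTAC))?$")


def classify_bucket_column(name):
    """Return the modality a column name belongs to, or None for any other column."""
    match = _BUCKET_RE.match(name)
    if match is None:
        return None
    suffix = match.group(2)
    if suffix == "ab":
        return "antibody"
    if suffix == "PROTAC":
        return "PROTAC"
    return "small molecule"
```

Applied to the header row of the TSV, this groups the bucket columns by modality and leaves every other column unclassified.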

Alongside the PROTAC tractability buckets, there is a "PROTAC_location_Bucket", which lets you assess whether a target's location is suitable for the PROTAC approach. It falls into one of the following categories:

High confidence good location
Med confidence good location
High confidence grey location
Med confidence grey location
Unknown location
Med confidence bad location
High confidence bad location

Installation

Change to the directory containing this file and run:

pip install .

Install cx_Oracle

Set the following environment variables:

CHEMBL_DB=oracle://address:to@local.chembl

CHEMBL_VERSION=25
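Before starting a run, it can help to confirm that both variables are set. The following is a minimal sketch; the helper `missing_settings` is hypothetical and not part of the pipeline.

```python
import os

# The two variables named in the installation steps above.
REQUIRED_VARS = ("CHEMBL_DB", "CHEMBL_VERSION")


def missing_settings(env=None):
    """Return the required variable names that are unset or empty.

    `env` defaults to the process environment; a plain dict can be
    passed instead, which makes the check easy to try out.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

Calling `missing_settings()` with no arguments checks the current environment; an empty list means both variables are present.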

Getting Started

Run the pipeline with the following command:

run-ot-pipeline genes.csv

Here, genes.csv is a file with one Ensembl Gene ID per line and no header row.
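The input format can be checked before a run. The following is a minimal sketch, assuming human Ensembl gene IDs of the form ENSG followed by 11 digits; the helper `find_invalid_ids` is hypothetical and not part of the pipeline.

```python
import re

# Assumption: human Ensembl gene IDs, e.g. ENSG00000141510.
ENSG_RE = re.compile(r"^ENSG\d{11}$")


def find_invalid_ids(lines):
    """Return (line_number, text) for every non-blank line that is not a gene ID.

    A header row such as "ensembl_gene_id" is reported as invalid,
    since the input file must not contain headers.
    """
    problems = []
    for line_no, raw in enumerate(lines, start=1):
        value = raw.strip()
        if value and not ENSG_RE.match(value):
            problems.append((line_no, value))
    return problems
```

An open file can be passed directly, for example `find_invalid_ids(open("genes.csv"))`; an empty result means every non-blank line looks like a gene ID.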
