K-BiOnt: Biomedical Relation Extraction with Knowledge Graph-based Recommendations

The K-BiOnt system integrates knowledge graphs (KGs) into biomedical relation extraction (RE) through a recommendation model to broaden the scope of RE systems. It combines a state-of-the-art deep learning biomedical RE system (BiOnt), used as the baseline, with an existing state-of-the-art KG-based recommendation system (TUP) to perform biomedical RE between different entity types, such as genes, phenotypes, diseases, and chemical compounds.

Our academic paper, which describes K-BiOnt in detail, can be found here.

Downloading Pre-Trained Weights

Available versions of pre-trained models are as follows:

The training details are described in our academic paper.

Getting Started

Our project includes a code adaptation of the TUP model, available here, which is not an open-source project. To produce the models described above, you must implement both TUP and BiOnt.

However, to run inference with the provided models or to access the processed datasets and knowledge graphs, see the following sections, use the K-BiOnt image available on Docker Hub, and git clone the TUP repository into the Docker container to set up the rest of the experimental environment.

Preprocessed Datasets

PGR-crowd (original and preprocessed)
DDI Corpus (original and preprocessed)
BC5CDR (original and preprocessed)

Preprocessed Knowledge Graphs/Ontologies

HPO (original and preprocessed)
ChEBI (original and preprocessed)
DO (original and preprocessed)

Predict New Data

To make predictions with the existing models, you first need to preprocess your original data. Afterwards, you can get predictions based on the joined outputs of BiOnt and TUP. With the provided models, the system supports only three types of relations: DRUG/CHEMICAL-DRUG/CHEMICAL, HUMAN PHENOTYPE-GENE, and CHEMICAL-DISEASE. Check data/ for a sample of the supported data format for each pair type. To make predictions, pass the following arguments, as in the example below:

$2: pair_type
$3: data_to_test
$4: biont_model_name
$5: tup_model_name

Example:

 python3 src/predict.py DRUG-DRUG data/drug_drug/sample drug_drug_model_ontologies drug_drug-transup

For more options check predict.sh.
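If you want to script several predictions, the invocation above can be assembled programmatically. The following is a minimal sketch, assuming the positional argument order shown in the example (pair type, data to test, BiOnt model name, TUP model name). The helper name build_predict_command is hypothetical and not part of the K-BiOnt code base.

```python
import shlex
from typing import List


def build_predict_command(
    pair_type: str,
    data_to_test: str,
    biont_model_name: str,
    tup_model_name: str,
    script: str = "src/predict.py",
) -> List[str]:
    """Build the argument list for the prediction script.

    Hypothetical helper: the argument order is assumed from the README
    example; it does not validate values against what the script accepts.
    """
    names = ("pair_type", "data_to_test", "biont_model_name", "tup_model_name")
    values = (pair_type, data_to_test, biont_model_name, tup_model_name)
    for name, value in zip(names, values):
        # Reject empty arguments early instead of failing inside the script.
        if not value or not value.strip():
            raise ValueError(f"{name} must be a non-empty string")
    return ["python3", script, *values]


if __name__ == "__main__":
    cmd = build_predict_command(
        "DRUG-DRUG",
        "data/drug_drug/sample",
        "drug_drug_model_ontologies",
        "drug_drug-transup",
    )
    # Print the command as a single shell-safe string for inspection.
    print(shlex.join(cmd))
```

The returned list can be passed to subprocess.run(cmd, check=True) from the repository root inside the container. A list is used instead of a single string so that no shell quoting is needed for paths.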

If your testing data is a small dataset with fewer than ten different entities, keep in mind when interpreting the final output that TUP predictions will be skewed: TUP searches for the correct answer within the top 10 ranked candidates (TOP@10), and with fewer than ten candidates every entity falls inside that cutoff.
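The TOP@10 effect on small datasets can be illustrated with a generic hit-at-k check. This is a minimal sketch and not TUP's actual evaluation code; the function hit_at_k and the entity names are hypothetical.

```python
from typing import Sequence


def hit_at_k(ranked_candidates: Sequence[str], correct: str, k: int = 10) -> bool:
    """Return True if the correct entity is among the first k ranked candidates."""
    return correct in ranked_candidates[:k]


if __name__ == "__main__":
    # Six entities only: even the worst possible ranking (correct answer last)
    # still counts as a hit, because the whole list fits inside the cutoff.
    small = [f"entity_{i}" for i in range(1, 7)]
    print(hit_at_k(small, "entity_6"))

    # Twenty-five entities: a correct answer ranked last is a miss.
    large = [f"entity_{i}" for i in range(1, 26)]
    print(hit_at_k(large, "entity_25"))
```

With fewer than ten candidates the check cannot fail for any entity present in the list, so a hit carries no information about ranking quality.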

Reference

Diana Sousa and Francisco M. Couto. 2022. Biomedical Relation Extraction with Knowledge Graph-based Recommendations. IEEE Journal of Biomedical and Health Informatics.
