salesforce/provis

Get the report and plots for BindingDB dataset


Hi.

Thanks for your great paper and package.
Could you please let me know how I can regenerate the reports for other datasets, such as the BindingDB, KIBA, and DAVIS datasets?

These are protein datasets for drug-target interaction problems.

Hi @behroozazarkhalili, thanks for your question. The attention analysis applies to two types of protein properties: (1) properties of a single amino acid (e.g. whether it is at a binding site), or (2) properties of pairs of amino acids within a single protein sequence (e.g. whether the two amino acids are in contact). I'm not familiar with the datasets you mentioned, but if they describe properties of that nature, then the approach could be applied to them.
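To make the two property types concrete, here is a minimal plain-Python sketch. It is not code from the provis repository: the function name `attention_alignment`, the toy attention matrix, the 0.3 threshold, and the convention that attention from residue i to residue j counts as "aligned" with a single-residue property when residue j has that property are all assumptions made for illustration.

```python
# Illustrative only: plain-Python stand-ins, not code from the provis repository.

def attention_alignment(attn, pair_property, threshold=0.3):
    """Fraction of high-attention (i, j) pairs for which pair_property(i, j) holds.

    attn is a square list of lists; attn[i][j] is the attention weight
    from residue i to residue j for one attention head.
    """
    aligned = 0
    total = 0
    n = len(attn)
    for i in range(n):
        for j in range(n):
            if attn[i][j] > threshold:
                total += 1
                if pair_property(i, j):
                    aligned += 1
    return aligned / total if total else 0.0


# Toy attention weights for a 4-residue protein (each row sums to 1).
attn = [
    [0.1, 0.7, 0.1, 0.1],
    [0.4, 0.1, 0.4, 0.1],
    [0.1, 0.1, 0.1, 0.7],
    [0.5, 0.2, 0.2, 0.1],
]

# Type (1): a property of a single amino acid, e.g. binding-site membership.
# Assumed convention: attention counts when the attended-to residue j has it.
binding_site = [False, True, False, True]
site_score = attention_alignment(attn, lambda i, j: binding_site[j])

# Type (2): a property of a pair of amino acids, e.g. whether they are in contact.
contacts = {(0, 1), (1, 0), (2, 3), (3, 2)}
contact_score = attention_alignment(attn, lambda i, j: (i, j) in contacts)

print(site_score, contact_score)
```

A dataset fits this kind of analysis only if it provides labels at one of these two granularities, per residue or per residue pair, within a single protein sequence.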

If this is the case, you would need to extend a few Python modules:

  • compute_edge_features.py, which is the main module for doing the attention analysis. You would need to extend the arguments and some of the logic to access your dataset, your features, and possibly your model, if you don't want to use one of the existing models.
  • features.py, which is where the properties for analysis are defined.
  • datasets.py, which is where the datasets are defined.
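The division of labour between the three modules can be pictured roughly as follows. This is a hypothetical sketch only: `ProteinRecord`, `load_toy_dataset`, `binding_site_feature`, and `run_analysis` are invented names that do not come from the provis code base, and the actual interfaces in datasets.py, features.py, and compute_edge_features.py may look quite different.

```python
# Hypothetical sketch: none of these names come from the provis code base.
from dataclasses import dataclass


@dataclass
class ProteinRecord:
    sequence: str
    binding_sites: list  # per-residue 0/1 labels, same length as sequence


def load_toy_dataset():
    """Stand-in for a dataset definition: yields sequences with annotations."""
    return [
        ProteinRecord("MKTA", [0, 1, 0, 1]),
        ProteinRecord("GAVL", [1, 0, 0, 0]),
    ]


def binding_site_feature(record):
    """Stand-in for a feature definition.

    Maps a record to {(i, j): 1} for every residue pair in which the
    attended-to residue j is annotated as a binding site.
    """
    n = len(record.sequence)
    return {
        (i, j): 1
        for i in range(n)
        for j in range(n)
        if record.binding_sites[j]
    }


def run_analysis(dataset, feature_fn):
    """Stand-in for the analysis entry point.

    A real analysis would compare these feature pairs against model
    attention; this toy version only counts the feature pairs per record.
    """
    return [len(feature_fn(record)) for record in dataset]


counts = run_analysis(load_toy_dataset(), binding_site_feature)
print(counts)
```

The point of the sketch is only the separation of concerns: the dataset supplies sequences and annotations, the feature turns annotations into residue-pair labels, and the analysis consumes both.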

Let me know if you have any other questions.