SmartDataAnalytics/BioKEEN

Applications to chemogenomics data

cthoyt opened this issue · 0 comments

The EXCAPE-DB (manuscript, data download) is the easiest database to use for chemogenomic data - it's actually the pinnacle of curation and preprocessing.

Until now, I've asked students to work on this, but they never realized how important it was, so I will finish the corresponding bio2bel repository myself. Then we will have the best available data set for this.

The thing is, it's very important to consider the IC50 value associated with each edge. How would that fit into the available models, if at all? Assigning a hard cutoff is not a good idea, since it would throw away a large amount of information. Maybe we could bin the values, but then we would also have to introduce some notion of ordering among the edges into the model.
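
To make the binning idea concrete, here is a minimal sketch of one way it could look. Everything in it is an assumption for illustration: the bin edges, the labels, the `inhibits_*` relation names, and the expectation that IC50 values arrive in nM are all hypothetical, and none of it reflects how EXCAPE-DB or BioKEEN actually represent activities. It converts IC50 to pIC50 and keeps the integer bin index next to the relation label, so the ordering is not lost even if the model only sees the label.

```python
import bisect
import math

# Hypothetical bin edges on the pIC50 scale (assumption, not taken from EXCAPE-DB).
PIC50_BIN_EDGES = (5.0, 6.0, 7.0)
# One label per bin, ordered from weakest to strongest activity (hypothetical names).
BIN_LABELS = ("inactive", "weak", "moderate", "potent")


def pic50_from_nm(ic50_nm: float) -> float:
    """Convert an IC50 in nM to pIC50 = -log10(IC50 in mol/L)."""
    if ic50_nm <= 0:
        raise ValueError("IC50 must be positive")
    return 9.0 - math.log10(ic50_nm)


def bin_index(pic50: float) -> int:
    """Return the ordinal bin (0 = weakest) that a pIC50 value falls into."""
    return bisect.bisect_right(PIC50_BIN_EDGES, pic50)


def to_edge(compound: str, target: str, ic50_nm: float):
    """Build a (head, relation, tail, ordinal) tuple for one measurement.

    The ordinal is kept alongside the relation label so that a model (or a
    loss function) could still use the ordering of the bins.
    """
    idx = bin_index(pic50_from_nm(ic50_nm))
    return compound, f"inhibits_{BIN_LABELS[idx]}", target, idx


if __name__ == "__main__":
    # Made-up compound/target identifiers and IC50 values, for illustration only.
    for row in [("c1", "t1", 10.0), ("c2", "t1", 500.0), ("c3", "t2", 100000.0)]:
        print(to_edge(*row))
```

Encoding the bins as separate relation types alone would still treat them as unrelated, so the open question remains how to make the model aware of the ordinal.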