Can we Predict Active Compounds in OSM Series 4?

Abstract

The discovery of new antimalarial medicines with novel mechanisms of action is key to combating the increasing reports of resistance to our frontline treatments. The Open Source Malaria (OSM) consortium have been developing compounds ("Series 4") which possess potent activity against Plasmodium falciparum in vitro and in vivo and have been suggested to act through the inhibition of PfATP4, an essential ion pump in the parasite membrane that regulates intracellular Na+ and H+ concentrations. This pump has not yet been crystallised, so in the absence of structural information about this target, a public competition was created to develop a model that would allow us to predict when compounds in Series 4 are likely to be active.

In the first round in 2016, six participants used the open data collated by OSM to develop moderately predictive models using diverse methods. Notably, all submitted models were available to all other participants in real time. Since then, further bioactivity data have been acquired and machine learning methods have developed rapidly, so a second round of the competition is now underway. The best-performing models from this second round will be used to predict novel analogues in Series 4 that will be synthesised and evaluated against the parasite. As such, the project will openly demonstrate the abilities of new machine learning algorithms in the prediction of active compounds where there is no confirmed target, frequently the central problem in phenotypic drug discovery.

Detailed Abstract: Can we Predict Active Compounds in OSM Series 4?

The Series 4 triazolopyrazines are a promising series of compounds that originated from a high-throughput screen (HTS) at Pfizer. Compounds in the series have been shown to possess high potency (down to 16 nM). Additionally, two compounds have shown efficacy in vivo. Inherited information for Series 4, together with follow-up studies, implicates inhibition of PfATP4, a P-type Na+-ATPase transporter that the malaria parasite uses to regulate its intracellular Na+ concentration, as the mechanism of action. There is a correlation between in vitro potency and PfATP4 activity.

We need to develop ways to predict new and potent Series 4 compounds in the absence of structural information about PfATP4. Initial attempts were made by Murray Robertson in 2015, and a follow-up competition was launched in 2016. This competition was designed to involve and utilise the expertise of the scientific community. There were six submissions by the end of the competition, and two equally well-performing models (created by Ho-Leung Ng and James McCulloch) were awarded prizes.

A subsequent round is currently ongoing, with the aim of involving companies and individuals specialising in AI and machine learning.

If you would like to participate, there is still time. All the relevant information can be found in either the Wiki or this issue. If you have any questions, please post on the issue.

What is currently needed for the paper?

The paper is being written up in the Google Doc here. Please edit accordingly.

  • The abstract is as above. Edits are discussed here.
  • The basis for the introduction has been written up but still needs expanding in some areas.
  • A very brief description of Round 1 is there but more information is required about the details of each of the original submitted models.
  • A graphical summary of the judging and results of the Round 1 submissions needs to be created.
  • The easiest approach is to write up what is happening in Round 2 as we go. If you have been participating in this round, it would be great if you could add to the paper anything relevant that describes your models/methods.
  • Conclusions need writing.