Dominic Habgood-Coote, Clare Wilson, Chisato Shimizu, Anouk M Barendregt, Ria Philipsen, Rachel Galassini, Irene Rivero Calle, Lesley Workman, Philipp Agyeman, Gerben Ferwerda, Suzanne T Anderson, Merlijn van den Berg, Marieke Emonts, Enitan D Carrol, Colin G Fink, Ronald de Groot, Martin L Hibberd, John Kanegaye, Mark P Nicol, Stéphane Paulus, Andrew J Pollard, Antonio Salas, Fatou Secka, Luregn J Schlapbach, Adriana H Tremoulet, Michael Walther, Werner Zenz, Michiel Van der Flier, Heather J Zar, Taco Kuijpers, Jane C Burns, Federico Martinón-Torres, Victoria J Wright, Lachlan JM Coin, Aubrey J Cunnington, Jethro A Herberg, Michael Levin*, Myrsini Kaforou*
* These authors contributed equally. Correspondence: m.kaforou@imperial.ac.uk
Appropriate treatment and management of children presenting with fever depend on accurate and timely diagnosis, but current diagnostic tests lack sensitivity and specificity, and are frequently too slow to inform initial treatment. As an alternative to pathogen detection, host gene expression signatures in blood have shown promise in discriminating several infectious and inflammatory diseases in a dichotomous manner. However, differential diagnosis requires simultaneous consideration of multiple diseases. Here we show that diverse infectious and inflammatory diseases can be discriminated by the expression levels of a single panel of genes in blood.
A multi-class supervised machine learning approach, incorporating clinical consequence of misdiagnosis as a “cost” weighting, was applied to a dataset made up of 12 publicly available whole blood gene expression microarray datasets including 1,212 children with 18 infectious or inflammatory diseases. Data were divided into training (75%) and test (25%) sets. The transcriptional panel was validated in a newly generated RNA-Sequencing dataset comprising 411 febrile children.
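As a rough illustration of the two ideas in this paragraph (a per-class 75%/25% split and a misdiagnosis "cost" weighting), the sketch below uses plain Python. It is not the pipeline used in the study: the function names, the two-class cost table and all numbers are hypothetical, and the cost is applied here as a minimum-expected-cost decision rule over predicted class probabilities, which is only one of several ways such a weighting could be incorporated.

```python
import random
from collections import defaultdict


def stratified_split(labels, train_frac=0.75, seed=0):
    """Return (train_idx, test_idx), splitting each class separately."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    train, test = [], []
    for idx in by_class.values():
        rng.shuffle(idx)
        cut = round(train_frac * len(idx))
        train.extend(idx[:cut])
        test.extend(idx[cut:])
    return sorted(train), sorted(test)


def min_expected_cost(probs, cost):
    """Pick the class with the lowest expected misdiagnosis cost.

    probs[k]   : estimated probability that the true class is k
    cost[k][j] : hypothetical cost of predicting j when the truth is k
    """
    classes = list(probs)
    expected = {
        j: sum(probs[k] * cost[k][j] for k in classes) for j in classes
    }
    return min(expected, key=expected.get)


# Illustrative costs only: missing a bacterial infection is assumed to be
# five times worse than treating a viral infection as bacterial.
COST = {
    "bacterial": {"bacterial": 0, "viral": 5},
    "viral": {"bacterial": 1, "viral": 0},
}

if __name__ == "__main__":
    probs = {"bacterial": 0.3, "viral": 0.7}
    print(min_expected_cost(probs, COST))
```

With these assumed costs the rule selects "bacterial" even though "viral" is the more probable class, because the expected cost of a missed bacterial infection (0.3 × 5 = 1.5) exceeds that of a false bacterial call (0.7 × 1 = 0.7).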
We identified 161 transcripts that classified patients into 18 disease categories, reflecting the individual causative pathogen and specific disease, and that predicted with high confidence the broad classes of bacterial infection, viral infection, malaria, tuberculosis or inflammatory disease. The transcriptional panel was validated in an independent cohort and benchmarked against existing dichotomous RNA signatures.
Our data suggest that classification of febrile illness can be achieved from a single blood sample, and they open the way to a new approach to clinical diagnosis.
European Union's Seventh Framework No. 279185; Horizon2020 No. 668303 PERFORM; Wellcome Trust (206508/Z/17/Z); Medical Research Foundation (MRF-160-0008-ELP-KAFO-C0801); NIHR Imperial BRC.