Exploring Data Representations

Keywords: Graphs, Data Formats, Structured Data

Need: Explore the best representations for gaining insight from electronic health records (EHR) and other healthcare data sources, using graph databases, graph algorithms and process mining approaches to better explore EHR data. Other modern approaches for encoding sequential information, e.g. graph embeddings and entity embeddings, would also be good to explore.
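
As an illustration of the process mining approach mentioned above, the following is a minimal sketch of a directly-follows graph, one common starting point in process mining, built from a toy event log. The event log, its column layout and the function name `directly_follows` are assumptions made for illustration only and do not come from any real dataset or library.

```python
from collections import Counter, defaultdict

# Hypothetical event log: (patient_id, timestamp, event) rows. All values are made up.
events = [
    ("p1", 1, "admission"),
    ("p1", 2, "blood_test"),
    ("p1", 3, "discharge"),
    ("p2", 1, "admission"),
    ("p2", 2, "imaging"),
    ("p2", 3, "blood_test"),
    ("p2", 4, "discharge"),
]

def directly_follows(rows):
    """Count how often event a is immediately followed by event b for the same patient."""
    # Group events into one time-ordered trace per patient.
    traces = defaultdict(list)
    for patient, _timestamp, event in sorted(rows, key=lambda r: (r[0], r[1])):
        traces[patient].append(event)
    # Each consecutive pair in a trace is one edge of the directly-follows graph.
    counts = Counter()
    for trace in traces.values():
        counts.update(zip(trace, trace[1:]))
    return counts

dfg = directly_follows(events)
for (source, target), weight in sorted(dfg.items()):
    print(f"{source} -> {target}: {weight}")
```

The edge weights give a first view of the most common pathways through the data, which could then be explored further with graph algorithms or used as input to an embedding method.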

Current Knowledge/Examples & Possible Techniques/Approaches: Recent work uses graph models to build simpler knowledge discovery systems, e.g. modelling electronic health records to improve diagnostic prediction, or to understand negative drug interactions. This is achieved by storing the information in a format that is closer to reality, i.e. as entities and the relationships between them. Possible techniques include graph data structures, embeddings and deep neural networks. See also: "Scalable and accurate deep learning with electronic health records".

Related Previous Internship Projects: n/a as first year of the scheme.

Enables Future Work: Graph representations and embeddings are important steps towards better features for ML/AI solutions, for example Graph Neural Networks, which have attracted recent interest. This also supports work looking at interoperability.

Outcome/Learning Objectives: Demonstration of transforming EHR data to a graph-based representation, and of the value this has for example pieces of analysis.
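
To make this kind of transformation concrete, here is a minimal sketch that turns a flat, table-like extract into a graph held as an adjacency structure, then runs one simple query on it. The records, the relation names and the helper functions `build_graph` and `patients_sharing` are hypothetical and chosen only for illustration. A real project would more likely use a graph database or a dedicated graph library.

```python
from collections import defaultdict

# Hypothetical flat EHR extract: one (patient_id, relation, code) row per record.
# All values are made up for illustration.
records = [
    ("p1", "diagnosed_with", "diabetes"),
    ("p1", "prescribed", "metformin"),
    ("p2", "diagnosed_with", "diabetes"),
    ("p2", "prescribed", "insulin"),
    ("p3", "diagnosed_with", "asthma"),
]

def build_graph(rows):
    """Build an undirected, edge-labelled graph as node -> {(relation, neighbour)}."""
    # Assumes patient ids and codes never share a name, so nodes do not collide.
    graph = defaultdict(set)
    for patient, relation, code in rows:
        graph[patient].add((relation, code))
        graph[code].add((relation, patient))
    return graph

def patients_sharing(graph, patient, relation):
    """Return other patients reachable in two hops via edges with the given relation."""
    shared = set()
    for rel, code in graph.get(patient, ()):
        if rel != relation:
            continue
        for rel_back, other in graph.get(code, ()):
            if rel_back == relation and other != patient:
                shared.add(other)
    return shared

graph = build_graph(records)
print(patients_sharing(graph, "p1", "diagnosed_with"))
```

In the flat table, the same question needs a self-join on the code column. In the graph it is a two-hop traversal, which is the kind of analysis where a graph representation can show its value.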

Datasets: Open healthcare datasets with relevant structures, with a view to extending to other datasets if successful.

Desired skill set: When applying, please highlight any experience in informatics, graph structures, deep learning and coding (including any coding in the open), as well as any other data science experience you feel is relevant.