HealthcareLake
HealthcareLake is a Terraform project that deploys a clinical research environment in minutes. It does this by deploying a serverless data lake, a FHIR API and an OMOP ETL job.
This repo contains a Terraform module for deploying the solution on AWS (./infra). We import two other modules:
- HealthcareLakeAPI: API to receive data (FHIR)
- HealthcareLakeETL: Spark ETL job (FHIR→OMOP)
While the root module imports infra/ and uses it once, multiple data lakes can be deployed as shown in this demo.
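As a rough illustration of using the module more than once, a root configuration could declare two module blocks that point at the same local source. This is a minimal sketch and not the project's own demo: the instance names lake_research and lake_staging are hypothetical, and any input variables that ./infra requires are left out because they are not listed in this README.

```hcl
# Hypothetical root-module configuration: two independent data lakes
# built from the same local module. Instance names are illustrative only.
module "lake_research" {
  source = "./infra"

  # Input variables required by ./infra (if any) would be set here.
  # They are not documented in this README, so none are shown.
}

module "lake_staging" {
  source = "./infra"

  # Same module, second instance. Terraform tracks its resources
  # separately under the address module.lake_staging.
}
```

Terraform manages each module block as a separate instance, so a single terraform apply would create both. If the module exposes inputs that feed resource names, each instance would likely need distinct values to avoid naming collisions.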
Motivation
Digital healthcare provided by the NHS in England typically operates in silos. GPs manage patient care in electronic systems that are separate from those used by hospitals, the ambulance service, 111, mental health services and others. Each data owner holds a wealth of data that would be more valuable combined than it is in isolation. While there are solutions that integrate this data for direct care purposes, there is no centralised solution for using it to inform future care or service provisioning.
Documentation
You can read our docs here.
Usage
Deployment
Initialise the modules
terraform init
Deploy Terraform changes
terraform apply
(Optional) Destroy Terraform infrastructure
terraform destroy