What is this?

This repository includes pipelines to transform data from a FHIR server (like HAPI, GCP FHIR store, or even OpenMRS) using the FHIR format into a data warehouse based on Apache Parquet files, or another FHIR server. There is also a query library in Python to make working with FHIR-based data warehouses simpler.

These tools are intended to be generic and eventually work with any FHIR-based data source and data warehouse. Here is the list of main directories with a brief description of their content:

pipelines/ *START HERE*: Batch and streaming pipelines to transform data from a FHIR-based source to an analytics-friendly data warehouse or another FHIR store.
docker/: Docker configurations for various servers/pipelines.
doc/: Documentation for project contributors. See the pipelines README and wiki for usage documentation.
utils/: Various artifacts for setting up an initial database, running pipelines, etc.
dwh/: Query library for working with distributed FHIR-based data warehouses.
bunsen/: A fork of a subset of the Bunsen project which is used to transform FHIR JSON resources to Avro records with SQL-on-FHIR schema.
e2e-tests/: Scripts for testing pipelines end-to-end.

NOTE: This was originally started as a collaboration between Google and the OpenMRS community.

lincmba/fhir-data-pipes

What is this?