synthetichealth/synthea

Explain inner workings of synthea

Zethson opened this issue · 3 comments

Requested Feature

Hi,

I'm thinking of tackling the massive task of implementing Synthea in Python with a team of mine. However, we find the current architecture of Synthea a bit confusing and were wondering whether there's a more detailed writeup somewhere beyond the wiki, or whether one of the main developers would be willing to spend an hour with me to dive into the details.

We know that there are a lot of JSON files that define the rules the modules use to generate the data, but the details are lost somewhere in the Java code.

Understanding the architecture is less important than understanding how the disease models work.

Any implementation of an engine that can execute the disease models should be sufficient, regardless of what classes or objects you design.

Our first paper has some high level description: https://doi.org/10.1093/jamia/ocx079

You should also study the Generic Module Framework, all the sub-articles, and the example walkthrough.

The architecture diagram is high-level, because the particular Java packages and classes don't matter too much.

You have disease models (inputs, as described in GMF) that get applied timestep by timestep to each Patient. As the Patient goes through the models, states alter statuses on the patient and can modify the patient's Health Record. At the end of the simulation, exporters convert the Health Record into various formats.
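To make the loop above concrete, here is a minimal Python sketch of an engine that steps one patient through one module. It is not how the Java code is organized. The module dictionary is a heavily simplified stand-in for a GMF JSON file (the `delay_ms` and `display` fields are assumptions made for brevity, not real GMF fields), and `Patient`, `process_module`, and `simulate` are hypothetical names.

```python
from dataclasses import dataclass, field

WEEK_MS = 7 * 24 * 60 * 60 * 1000

# Hypothetical, simplified module in the spirit of a GMF JSON file.
# Real GMF states carry more detail (codes, units, richer transitions).
EXAMPLE_MODULE = {
    "name": "Example Condition",
    "states": {
        "Initial": {"type": "Initial", "direct_transition": "Wait"},
        "Wait": {"type": "Delay", "delay_ms": 2 * WEEK_MS,
                 "direct_transition": "Onset"},
        "Onset": {"type": "ConditionOnset", "display": "Example condition",
                  "direct_transition": "Terminal"},
        "Terminal": {"type": "Terminal"},
    },
}


@dataclass
class Patient:
    health_record: list = field(default_factory=list)
    current_state: dict = field(default_factory=dict)  # module name -> state name
    entered_at: dict = field(default_factory=dict)     # module name -> time in ms


def process_module(module, patient, time):
    """Advance the patient through the module as far as possible at `time`."""
    name = module["name"]
    state_name = patient.current_state.setdefault(name, "Initial")
    patient.entered_at.setdefault(name, time)
    while True:
        state = module["states"][state_name]
        kind = state["type"]
        if kind == "Terminal":
            return
        if kind == "Delay":
            if time < patient.entered_at[name] + state["delay_ms"]:
                return  # still waiting; try again on the next timestep
        elif kind == "ConditionOnset":
            # States may write to the patient's health record.
            patient.health_record.append(
                {"type": "condition", "display": state["display"], "start": time}
            )
        # Follow the transition and remember when the new state was entered.
        state_name = state["direct_transition"]
        patient.current_state[name] = state_name
        patient.entered_at[name] = time


def simulate(patient, modules, start, stop, timestep=WEEK_MS):
    """Apply every module to the patient once per timestep."""
    time = start
    while time < stop:
        for module in modules:
            process_module(module, patient, time)
        time += timestep
    return patient


patient = simulate(Patient(), [EXAMPLE_MODULE], start=0, stop=4 * WEEK_MS)
print(patient.current_state, patient.health_record)
```

An exporter would then walk `patient.health_record` and write it out in whatever format is needed; that step is left out here.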

Which patients are created depends on the census demographics (an input).

The healthcare facilities those patients visit during their simulated lives (for example, hospitals or clinics) depend on the provider files (another input).

Great, thank you. A few follow up questions:

  1. How do you advance time? What defines the step size for time? How does this work?
  2. What are the data structures that the engine uses to keep track of the state of the patient?

  1. You advance time by adding the length of the timestep to the current time: time = time + timestep

The timestep is defined as a parameter: generate.timestep

this.timestep = Long.parseLong(Config.get("generate.timestep"));

generate.timestep = 604800000
# time is in ms
# 1000 * 60 * 60 * 24 * 7 = 604800000
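For a Python port, the same idea might look like the sketch below: read `generate.timestep` from properties-style text and step the clock forward in milliseconds. The helper names `parse_properties` and `timesteps` are hypothetical, and the parser only handles simple `key = value` lines and `#` comments.

```python
from datetime import timedelta

PROPERTIES_TEXT = """
generate.timestep = 604800000
# time is in ms
"""


def parse_properties(text):
    """Parse simple `key = value` lines, skipping blanks and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def timesteps(start_ms, stop_ms, timestep_ms):
    """Yield each simulation time from start (inclusive) to stop (exclusive)."""
    time = start_ms
    while time < stop_ms:
        yield time
        time = time + timestep_ms  # time = time + timestep


timestep = int(parse_properties(PROPERTIES_TEXT)["generate.timestep"])

# The configured value is exactly one week in milliseconds.
assert timedelta(milliseconds=timestep) == timedelta(weeks=1)

for t in timesteps(0, 3 * timestep, timestep):
    print(t)
```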

  2. The state of the patients (where they are in the modules) is tracked in the engine package:

https://github.com/synthetichealth/synthea/tree/master/src/main/java/org/mitre/synthea/engine

What is in the medical record is defined in HealthRecord.java:

https://github.com/synthetichealth/synthea/blob/master/src/main/java/org/mitre/synthea/world/concepts/HealthRecord.java
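If it helps when planning a port, the two pieces of state mentioned above (position in each module, and the contents of the medical record) could be modeled roughly as follows. This is only a sketch under assumptions: the class and field names are hypothetical and do not mirror the Java classes, so check the linked sources for what is actually stored.

```python
from dataclasses import dataclass, field, asdict
from typing import Optional


@dataclass
class Entry:
    kind: str                   # e.g. "condition", "medication", "procedure"
    display: str
    start: int                  # time in ms
    stop: Optional[int] = None  # None while the entry is still active


@dataclass
class Encounter:
    start: int
    provider: str
    entries: list = field(default_factory=list)


@dataclass
class HealthRecord:
    encounters: list = field(default_factory=list)

    def start_encounter(self, time, provider):
        encounter = Encounter(start=time, provider=provider)
        self.encounters.append(encounter)
        return encounter


@dataclass
class PatientState:
    attributes: dict
    record: HealthRecord = field(default_factory=HealthRecord)
    # module name -> list of (state name, time entered), oldest first
    module_history: dict = field(default_factory=dict)

    def enter_state(self, module, state, time):
        self.module_history.setdefault(module, []).append((state, time))

    def current_state(self, module):
        history = self.module_history.get(module)
        return history[-1][0] if history else None


patient = PatientState(attributes={"gender": "F", "birthdate": 0})
patient.enter_state("Example Condition", "Initial", 0)
patient.enter_state("Example Condition", "Onset", 1209600000)
encounter = patient.record.start_encounter(1209600000, provider="Example Clinic")
encounter.entries.append(Entry("condition", "Example condition", 1209600000))

# asdict gives a plain nested structure that an exporter could serialize.
exported = asdict(patient)
```

Keeping the full per-module history, rather than only the current state, is a design choice in this sketch: it makes it easy to ask when a patient entered a given state.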