/synthea-rdf

Semantic web representation for Synthea.

Primary language: Python. License: GNU General Public License v3.0 (GPL-3.0).

🕸️ SYNTHEA RDF

KnAcc Lab License: GPL v3

A Semantic Web representation of Synthea™ data, together with a tool that converts Synthea CSV files to Turtle (.ttl).


🔨 Installation

Method 1: Poetry

Poetry installation guide

  1. Clone the repository
  2. python3 -m venv .venv
  3. source .venv/bin/activate
  4. poetry install

🔌 Activate the .venv environment every time before using synthea-rdf by running source .venv/bin/activate.

Method 2: Pip

pip install synthea-rdf

⚡ Usage

Conversion

All conversion configurations should be specified in configuration.yaml.

Here is a sample configuration.yaml.

model_path: synthea_ontology/synthea_ontology.ttl
synthea_csv_path: ../synthea/output/1000k/csv
output_path: result/1000k
chunk_size: 300000
include_dua: True
include_trustscore: True
skip:
  - allergies.csv
  - careplans.csv
  - claims_transactions.csv
  - claims.csv
  - conditions.csv
  - devices.csv
  - encounters.csv
  - imaging_studies.csv
  - immunizations.csv
  - medications.csv
  - observations.csv
  - organizations.csv
  - patients.csv
  - patient_expenses.csv
  - payer_transitions.csv
do_shutdown: False
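
Before starting a long conversion, it can help to sanity-check the configuration. The following is a minimal sketch and not part of synthea-rdf: `check_config`, `files_to_convert`, and `load_config` are hypothetical helpers, the expected types are inferred from the sample above, and reading `skip` as "CSV files to leave out of the conversion" is an assumption. `load_config` additionally assumes PyYAML is installed.

```python
from pathlib import Path

# Keys and types inferred from the sample configuration.yaml above.
# This is an assumption, not a schema published by synthea-rdf.
EXPECTED_TYPES = {
    "model_path": str,
    "synthea_csv_path": str,
    "output_path": str,
    "chunk_size": int,
    "include_dua": bool,
    "include_trustscore": bool,
    "skip": list,
    "do_shutdown": bool,
}


def check_config(cfg: dict) -> list[str]:
    """Return a list of problems; an empty list means the config looks sane."""
    problems = []
    for key, expected in EXPECTED_TYPES.items():
        if key not in cfg:
            problems.append(f"missing key: {key}")
            continue
        value = cfg[key]
        # bool is a subclass of int, so reject True/False where a number is expected.
        wrong_type = not isinstance(value, expected) or (
            expected is int and isinstance(value, bool)
        )
        if wrong_type:
            problems.append(
                f"{key}: expected {expected.__name__}, got {type(value).__name__}"
            )
    return problems


def files_to_convert(csv_dir: str, skip: list[str]) -> list[str]:
    """List the CSV file names in csv_dir that are not named in skip."""
    skipped = set(skip)
    return sorted(p.name for p in Path(csv_dir).glob("*.csv") if p.name not in skipped)


def load_config(path: str = "configuration.yaml") -> dict:
    """Read the YAML file into a dict (requires PyYAML)."""
    import yaml  # imported lazily so the checks above work without PyYAML

    with open(path) as fh:
        return yaml.safe_load(fh)
```

Calling `check_config(load_config())` and then `files_to_convert(cfg["synthea_csv_path"], cfg["skip"])` shows at a glance whether anything is missing and which files would remain under that reading of `skip`. With the sample above, nearly every CSV file is listed under `skip`.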

Once the configuration is in place, run:

python3 conversion.py
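
The sample configuration sets `chunk_size` without describing it. One plausible reading, and it is only an assumption, is that it caps how many CSV rows are held in memory at once. The sketch below illustrates that general technique with the standard library; `iter_chunks` is a hypothetical helper and does not show how conversion.py actually reads its input.

```python
import csv
from itertools import islice
from typing import Iterator


def iter_chunks(csv_path: str, chunk_size: int) -> Iterator[list[dict]]:
    """Yield the rows of a CSV file in lists of at most chunk_size rows."""
    with open(csv_path, newline="") as fh:
        reader = csv.DictReader(fh)
        while True:
            # islice pulls the next chunk_size rows without loading the whole file.
            chunk = list(islice(reader, chunk_size))
            if not chunk:
                break
            yield chunk
```

Each yielded chunk can be processed and written out before the next one is read, so peak memory depends on the chunk size rather than on the size of the file.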

Running conversion process with TMUX

The larger the dataset, the longer the conversion takes. For large inputs it is better to run the conversion in the background and check its progress from time to time. A convenient way to do this is to start the process in a TMUX session and detach from it; you can check the progress later by reattaching to the session.

Example:

  1. $ tmux
  2. $ python3 conversion.py
  3. Press [CTRL]+[b], then [d] to detach from the TMUX session.
  4. It is now safe to log off. (⚠️ Do not shut down the machine!)
  5. $ tmux a (reattaches to the session so you can check the progress)
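
The steps above can also be scripted. The sketch below starts the conversion in an already detached session using tmux's `new-session -d -s <name> <command>` form. The session name `synthea-rdf` and the helper names are assumptions chosen for illustration, and the script is not part of synthea-rdf.

```python
import shlex
import shutil
import subprocess


def tmux_command(session: str, command: list[str]) -> list[str]:
    """Build the tmux invocation that runs command in a detached session."""
    # -d: do not attach to the new session; -s: give the session a name.
    # The final argument is the shell command tmux runs inside the session.
    return ["tmux", "new-session", "-d", "-s", session, shlex.join(command)]


def start_detached(session: str = "synthea-rdf",
                   command: tuple[str, ...] = ("python3", "conversion.py")) -> None:
    """Start the conversion in the background; raises if tmux is unavailable."""
    if shutil.which("tmux") is None:
        raise RuntimeError("tmux is not installed or not on PATH")
    subprocess.run(tmux_command(session, list(command)), check=True)
```

After `start_detached()`, running `tmux attach -t synthea-rdf` reattaches to the session. A session started this way normally closes when the command exits, so the interactive steps above are the better choice if you want to read the final output afterwards.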

Trust Score and DUA generation

Use the Trust Score and Data Usage Agreement (DUA) generator to produce the optional Trust Score and DUA data:

python3 trustscore_dua_generator.py
