OHDSI/ETL-Synthea

Cancer ETL

Closed this issue · 11 comments

Hi,

I am working on a cancer ETL. What is the best order in which to populate the domains?

Episode
Episode_event
Condition_occurrence
Procedure_occurrence
Drug_exposure
Measurement (link to condition_occurrence)

Does the above order seem reasonable?

Thanks,
Priya

Hi Priya, I would actually put EPISODE and EPISODE_EVENT at the bottom of your list. They are meant to group facts from CONDITION_OCCURRENCE, PROCEDURE_OCCURRENCE, DRUG_EXPOSURE and MEASUREMENT into episodes of care. I think this presentation gives a good overview of how these tables work together.

https://github.com/OHDSI/CommonDataModel/files/2642492/Oncology.CDM.Proposal.2018-12-02.pdf
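
The load order described above can be sketched roughly as follows. This is only an illustration using Python's built-in sqlite3 module, not the ETL-Synthea implementation. The tables are heavily simplified, the `event_table` text column is a hypothetical stand-in for the field concept ID that EPISODE_EVENT uses to identify the source table, and all IDs and dates are made up.

```python
import sqlite3

# Simplified, hypothetical versions of the CDM tables (real ones have many more columns).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE condition_occurrence (
    condition_occurrence_id INTEGER PRIMARY KEY,
    person_id INTEGER,
    condition_start_date TEXT
);
CREATE TABLE drug_exposure (
    drug_exposure_id INTEGER PRIMARY KEY,
    person_id INTEGER,
    drug_exposure_start_date TEXT
);
CREATE TABLE episode (
    episode_id INTEGER PRIMARY KEY,
    person_id INTEGER,
    episode_start_date TEXT,
    episode_end_date TEXT
);
-- event_table is an assumption made for readability only.
CREATE TABLE episode_event (
    episode_id INTEGER,
    event_table TEXT,
    event_id INTEGER
);
""")

# Step 1: load the clinical facts first.
conn.executemany(
    "INSERT INTO condition_occurrence VALUES (?, ?, ?)",
    [(1, 100, "2019-01-10")],
)
conn.executemany(
    "INSERT INTO drug_exposure VALUES (?, ?, ?)",
    [(10, 100, "2019-02-01"), (11, 100, "2019-02-22")],
)

# Step 2: only then derive an episode that groups those facts.
conn.execute("""
INSERT INTO episode (episode_id, person_id, episode_start_date, episode_end_date)
SELECT 1, person_id, MIN(drug_exposure_start_date), MAX(drug_exposure_start_date)
FROM drug_exposure
WHERE person_id = 100
GROUP BY person_id
""")

# Step 3: link each grouped fact back to the episode.
conn.execute("""
INSERT INTO episode_event (episode_id, event_table, event_id)
SELECT 1, 'drug_exposure', drug_exposure_id
FROM drug_exposure
WHERE person_id = 100
""")

episode = conn.execute(
    "SELECT episode_start_date, episode_end_date FROM episode"
).fetchone()
linked = conn.execute("SELECT COUNT(*) FROM episode_event").fetchone()[0]
print(episode, linked)
```

Because the episode rows are derived from rows that already exist in the clinical tables, EPISODE and EPISODE_EVENT naturally come last in the load order.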

Hi Team,

The presentation provided is very useful. Claire, thanks so much. For mapping conditions and measurements, I noticed that in the Synthea data the conditions are coded in SNOMED CT, not ICD-O. I can map the SNOMED codes as they are, right? I need to modify the measurement table to include modifier_id and modifier_event_id, and then link the measurement records to their corresponding conditions. Is my understanding right?

Thanks,
Priya

Hi Priya, you are correct :) You will need to modify the measurement table as you say, and you can definitely use SNOMED for this example.
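
A minimal sketch of that link, using Python's built-in sqlite3 module. The column names `modifier_of_event_id` and `modifier_of_field_concept_id` are an assumption about how the two extra measurement columns might be named, the tables are cut down to a few columns, and every concept ID below is a made-up placeholder rather than a real vocabulary entry.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE condition_occurrence (
    condition_occurrence_id INTEGER PRIMARY KEY,
    person_id INTEGER,
    condition_concept_id INTEGER
);
CREATE TABLE measurement (
    measurement_id INTEGER PRIMARY KEY,
    person_id INTEGER,
    measurement_concept_id INTEGER,
    modifier_of_event_id INTEGER,          -- assumed column name
    modifier_of_field_concept_id INTEGER   -- assumed column name
);
""")

# Hypothetical concept meaning "this ID points at condition_occurrence_id".
CONDITION_ID_FIELD_CONCEPT = 9000001

# One cancer diagnosis (placeholder concept ID).
conn.execute("INSERT INTO condition_occurrence VALUES (1, 100, 9000100)")

conn.executemany(
    "INSERT INTO measurement VALUES (?, ?, ?, ?, ?)",
    [
        # Stage and grade modify condition_occurrence 1.
        (500, 100, 9000200, 1, CONDITION_ID_FIELD_CONCEPT),
        (501, 100, 9000201, 1, CONDITION_ID_FIELD_CONCEPT),
        # An unrelated lab result modifies nothing.
        (502, 100, 9000300, None, None),
    ],
)

# Retrieve each condition together with the measurements that modify it.
rows = conn.execute(
    """
    SELECT c.condition_occurrence_id, m.measurement_id
    FROM condition_occurrence c
    JOIN measurement m
      ON m.modifier_of_event_id = c.condition_occurrence_id
     AND m.modifier_of_field_concept_id = ?
    ORDER BY m.measurement_id
    """,
    (CONDITION_ID_FIELD_CONCEPT,),
).fetchall()
print(rows)
```

The field concept column is what keeps the join unambiguous: the same event ID value could otherwise refer to a row in a different table.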

Hi Claire,
I have created a measurement temp table with record_id and modifier_field_Concept_id for cancer-related conditions that have measurements for size, grade, stage, etc. Can you please take a look and let me know if my approach is right? I have included the SQL script, the measurement temp table output, the source data, and the OMOP condition_occurrence table.

After you confirm, I can start working on episodes.

Thanks,
Priya
questions.zip

Very nice, @priagopal. Almost perfect.

You are using LOINC concepts for the cancer modifiers. But we created a new vocabulary called Cancer Modifier. For example, you use Regional lymph nodes.clinical [Class] Cancer. Instead, you should use Regional Lymph Nodes.

Right now, we don't have all the mappings from LOINC, SNOMED, etc. to Cancer Modifier. We are working on it. Once that is done, we can de-standardize the LOINC and SNOMED concepts, and you won't be tempted to use them for your data any longer.
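
Until those mappings are published, one way to prepare the ETL is to keep the remapping in a single lookup and report anything that has no Cancer Modifier target yet. The sketch below is an illustration only: the concept IDs are made-up placeholders, and `remap_modifiers` is a hypothetical helper, not part of any OHDSI package.

```python
from typing import Dict, List, Tuple

# Placeholder IDs: LOINC-style modifier concept -> Cancer Modifier concept.
# e.g. "Regional lymph nodes.clinical [Class] Cancer" -> "Regional Lymph Nodes"
LOINC_TO_CANCER_MODIFIER: Dict[int, int] = {
    9100001: 9200001,
}


def remap_modifiers(concept_ids: List[int]) -> Tuple[List[int], List[int]]:
    """Swap in Cancer Modifier concepts where a mapping is known.

    Returns the remapped list plus the IDs that still lack a mapping,
    so they can be revisited once the vocabulary is updated.
    """
    remapped: List[int] = []
    unmapped: List[int] = []
    for concept_id in concept_ids:
        target = LOINC_TO_CANCER_MODIFIER.get(concept_id)
        if target is None:
            remapped.append(concept_id)  # keep the original concept for now
            unmapped.append(concept_id)
        else:
            remapped.append(target)
    return remapped, unmapped


result = remap_modifiers([9100001, 9100002])
print(result)
```

Keeping the lookup in one place means the script only needs a new dictionary entry, or a join against the vocabulary's mapping table, when more Cancer Modifier concepts become available.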

Hi Christian,

Thanks so much for your review. I can update the ETL script once the Regional Lymph Nodes concept is available in the concept table. I think I might have to do the same for other cancer-related finding values that are mapped to LOINC instead of Cancer Modifier (e.g., distant metastases).

I was wondering whether I could create draft EPISODE and EPISODE_EVENT ETL scripts for the Synthea data in the meantime, with the understanding that some of the concept information might change as the standard concepts are updated.

Please let me know if this is an acceptable approach.

Thanks,
priya

I know. We feel guilty pulling the carpet while you are on it. But it sounds worse than it is.

Do you have a list of source codes you want to capture as Cancer Modifiers or Episodes?

Hi Christian,

I have attached the distinct cancer modifiers. Based on the data, the episodes are first occurrence, treatment cycle, and cancer surgery.

Please let me know if you need any other information.

Thanks so much for your help.

Priya

CancerModifier.csv

Hi Christian/Claire,

In the example below, the condition records related to first occurrence, relapse, and remission are included in the EPISODE domain. But in Athena, those concepts are invalidated. I wasn't sure whether I should model only treatment abstractions in the Episode domain, and not condition abstractions.

I was referring to the attached presentation to understand Episode mapping.
Oncology.CDM.Proposal.2018-12-02 (2).pdf

Thanks,
priya

Thanks for the interest. The status of the disease should be abstracted as an Episode, provided the source data gives enough information to do so.
The invalidated codes were redefined, but no mappings were added. Relapse and remission are properties of Dynamics, and they should be mapped to the appropriate concepts:
Remission and Progression (Progression being practically identical to Relapse)
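
As a rough sketch of that mapping, the lookup below resolves a source disease status to the name of the target concept. Only the two pairs stated above are included. The function name is hypothetical, and concept names stand in for concept IDs, which would have to be looked up in Athena.

```python
from typing import Dict, Optional

# Source status -> target concept name, as described above.
DYNAMICS_TARGET: Dict[str, str] = {
    "remission": "Remission",
    "relapse": "Progression",  # Relapse is treated as Progression
}


def dynamics_concept_name(source_status: str) -> Optional[str]:
    """Return the target concept name, or None if no mapping is known."""
    return DYNAMICS_TARGET.get(source_status.strip().lower())


print(dynamics_concept_name("Relapse"))
```

Statuses outside this lookup, such as first occurrence, return None so they can be handled separately instead of being guessed.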