OHDSI/ETL-Synthea

ETL LoadEventTables

Ryan-nd opened this issue · 4 comments

Hello,

Would like to address the same issue with loading the condition occurrence as it is taking a long time to process as compared to the insert visit occurrence, visit details table. Attached the first picture (Conditions table), 2nd photo is the processing. Do appreciate if any insights could be given.
omop convertion 2
omop convertion 1

I think this depends on a number of factors including the infrastructure that you are running on as well as the amount of data that you are loading. For example here's an execution of ETL-Synthea I ran this morning where insert_condition_occurrence.sql ran very quickly. I don't believe this is a particular issue with the package.

Connecting using PostgreSQL driver
Running: insert_person.sql
|=========================================================================================================================| 100%
Executing SQL took 0.0112 secs
Running: insert_observation_period.sql
|=========================================================================================================================| 100%
Executing SQL took 0.00843 secs
Running: insert_provider.sql
|=========================================================================================================================| 100%
Executing SQL took 0.0237 secs
Running: insert_visit_occurrence.sql
|=========================================================================================================================| 100%
Executing SQL took 0.414 secs
Running: insert_visit_detail.sql
|=========================================================================================================================| 100%
Executing SQL took 0.35 secs
Running: insert_condition_occurrence.sql
|=========================================================================================================================| 100%
Executing SQL took 0.0462 secs
Running: insert_observation.sql
|=========================================================================================================================| 100%
Executing SQL took 3.9 secs
Running: insert_measurement.sql
|=========================================================================================================================| 100%
Executing SQL took 0.881 secs
Running: insert_procedure_occurrence.sql
|=========================================================================================================================| 100%
Executing SQL took 0.787 secs
Running: insert_drug_exposure.sql
|=========================================================================================================================| 100%
Executing SQL took 0.412 secs
Running: insert_condition_era.sql
|=========================================================================================================================| 100%
Executing SQL took 0.0397 secs
Running: insert_drug_era.sql
|=========================================================================================================================| 100%
Executing SQL took 0.0168 secs
Running: insert_cdm_source.sql
|=========================================================================================================================| 100%
Executing SQL took 0.00616 secs
Running: insert_device_exposure.sql
|=========================================================================================================================| 100%
Executing SQL took 0.00632 secs
Running: insert_death.sql
|=========================================================================================================================| 100%
Executing SQL took 0.0877 secs
Running: insert_payer_plan_period.sql
|=========================================================================================================================| 100%
Executing SQL took 0.0186 secs
Running: insert_cost_v300.sql
|=========================================================================================================================| 100%
Executing SQL took 7.98 secs

Hi Frank,

Thanks for the fast response! Would you say its because of Athena Dictionary that affects the performance of my ETL as I uploaded 5GB worth of Data and 1k of Synthetic Data. Appreciate the insights given!

Regards,
Ryan Li

That is possible. I'm working on a new ETL artifact standard to try and help with questions just like this and will let you know when there is some progress there. The best thing I could suggest would be doing an explain plan on the insert statement causing the delay and see if there are any particular issues. Perhaps indexes on the vocabulary tables are missing?

Ooh alright. Will do, appreciate the help and quick response that was given!

Regards,
Ryan Li