The goal of the project is to study the association between mtDNA haplogroups and delirium in sepsis patients.
- Haplogroup:
Mito Delirium BioVU Data/Genetics/CAM_Haplogroups.xlsx
- Date of death (DOD):
Mito Delirium BioVU Data/Demographics/Date of Death data.xlsx (sheet FINAL DATE OF DEATH DATA)
- All cohort subjects:
datafile/sepsis_grids_20200106.xlsx (sheet All GRIDs)
- Admission & discharge dates (include worst SOFA score per admission):
datafile/sepsis_compare_20191217.csv
- Daily CAM status:
datafile/daily_status_20190925.csv
- Daily lab (includes SOFA, RASS, creatinine, platelet ...):
datafile/daily_sofa_score_20191010.csv
- Neuro damage data (remove encounters with "bad" icd codes):
- ICD9:
Mito Delirium BioVU Data/Neuro Exclusions/neuro_icd9_V2.xlsx
- ICD10:
Mito Delirium BioVU Data/Neuro Exclusions/neuro_icd10_V2.xlsx
- Comorbidity score:
Mito Delirium BioVU Data/Elixhauser Comorbidities/*.xlsx
- Daily dementia:
Mito Delirium BioVU Data/Dementia/Dementia.xlsx (sheet Initial Dementia Code date)
- Medications:
Mito Delirium BioVU Data/Data/grid_date_med1.csv
Mito Delirium BioVU Data/Data/grid_date_med2.csv
Mito Delirium BioVU Data/Data/grid_date_med3.csv
- In ICU death manual review:
datafile/in_ICU_death_manual_review.txt
- Data dictionary:
data_arxiv/analysis_daily_dict.xlsx
- Daily-level data:
data_arxiv/analysis_daily.rds
- Encounter-level data: grid and adm_date in daily-level data uniquely determine an encounter
- Subject-level data: grid in daily-level data uniquely determines a subject
- TODO
The initial steps have been done here:
- Determine daily status during hospitalization.
- Identify hospital encounters with at least one CAM-ICU assessment.
- Identify encounters with sepsis.
- Step 1 & 2. Determine daily status and identify CAM-ICU encounters
- Step 3. Identify sepsis
- Misc.
- Data Dictionary
The code reconstruct_daily_visit_data.R does the following:
- Clean and combine CAM-ICU data and RASS data, resolves discrepancy
- CAM-ICU is a tool to detect delirium in ICU patients, usually assessed every 8 hours in ICU. Valid CAM-ICU values:
- Positive - Delirium present
- Negative - No delirium
- UA - Patient in coma, cannot assess CAM-ICU score
- Unk - This value is assigned in analysis for conflicting CAM-ICU values at the same time point
- RASS measures how awake and alert the patient is, usually assessed every 4 hours in ICU. Obtaining a RASS score is the first step in administering CAM-ICU. Valid RASS scores are from -5 to 4.
- If RASS score is -3 to 4, CAM-ICU is assessable and should be either Positive or Negative.
- If RASS score is -5 or -4, patient is in coma, and CAM-ICU value should be UA.
- CAM-ICU and RASS data was joined by GRID and assessment time.
- Determine daily status
- For each day, the daily status is:
- Delirious if any CAM-ICU value was Positive.
- Otherwise Unknown: conflicting CAM if any CAM-ICU value was Unk.
- Otherwise Comatose if any RASS value was -5 or -4.
- Otherwise Normal if any CAM-ICU value was Negative.
- Otherwise Unknown: RASS only if all CAM-ICU values were missing and there was at least one non-missing RASS value.
- Otherwise Unknown: No CAM nor RASS if all CAM-ICU and RASS values were missing.
- Output Data: Girard_BioVU/output/daily_status_20190925.csv
- Identify encounters with at least one CAM-ICU assessment and get encounter/visit level summary
- For each admission/discharge record, find all dates from the admission date to the discharge date.
- Merge with daily status obtained above (keep Comatose, Delirious, Normal only) and redefine admission/discharge dates
- Consecutive dates were considered as one encounter, and the first/last dates of each group of consecutive dates were taken as the admission/discharge dates.
- The reason we did all this was because:
- some admission/discharge records did not have a discharge date.
- around 20,000 hospital dates with daily status did not fall into any of the admission/discharge records and we do not want to throw them away.
- Calculate a few summary statistics at encounter level and remove encounters without any CAM-ICU.
- Output Data: Girard_BioVU/output/cam_stay_20190925.csv
- Girard_BioVU/output/data_raw.RData
- Including admission/discharge data, CAM data, and RASS data
- Girard_BioVU/output/changed_grid_dob_20190924.csv
- Data used to correct GRIDs and dates
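The daily-status hierarchy listed above can be sketched as a small function. This is only an illustration in Python (the project code itself is reconstruct_daily_visit_data.R); the function name `daily_status`, the use of `None` for missing values, and the final `"Unclassified"` fallback for days not covered by the listed rules (for example, only UA CAM-ICU values with no coma-range RASS) are assumptions, not part of the original code.

```python
from typing import Optional, Sequence

COMA_RASS = {-5, -4}  # RASS values treated as coma in the rules above


def daily_status(cam_values: Sequence[Optional[str]],
                 rass_values: Sequence[Optional[int]]) -> str:
    """Apply the daily-status rules in priority order for one patient-day."""
    cams = [c for c in cam_values if c is not None]   # drop missing CAM-ICU
    rass = [r for r in rass_values if r is not None]  # drop missing RASS

    if "Positive" in cams:
        return "Delirious"
    if "Unk" in cams:
        return "Unknown: conflicting CAM"
    if any(r in COMA_RASS for r in rass):
        return "Comatose"
    if "Negative" in cams:
        return "Normal"
    if not cams and rass:
        return "Unknown: RASS only"
    if not cams and not rass:
        return "Unknown: No CAM nor RASS"
    # Assumption: combinations not described in the rules get a placeholder.
    return "Unclassified"


print(daily_status(["Negative", "Positive"], [-5, 0]))  # Positive wins
```

Because the rules are checked in order, a single Positive CAM-ICU overrides any coma-range RASS on the same day.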
Refer to these reports for general ideas and more details. However, note that none of them correct for changed GRIDs.
- Girard_BioVU/code/no_git/20190619_cam_gap.html
- Girard_BioVU/code/no_git/20190319_daily_status.html
- Girard_BioVU/code/no_git/20190716_visit_summary.html
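The encounter redefinition described above (consecutive dates treated as one encounter, with the first and last dates taken as admission and discharge) can be sketched as follows. This is a minimal Python illustration, not the project's R code; the function name `group_consecutive_dates` is hypothetical, and it assumes the input holds the hospital dates of a single GRID.

```python
from datetime import date, timedelta
from typing import Iterable, List, Tuple


def group_consecutive_dates(dates: Iterable[date]) -> List[Tuple[date, date]]:
    """Collapse runs of consecutive calendar dates into (admission, discharge) pairs."""
    encounters: List[Tuple[date, date]] = []
    for d in sorted(set(dates)):
        if encounters and d - encounters[-1][1] == timedelta(days=1):
            # d continues the current run: extend its discharge date
            encounters[-1] = (encounters[-1][0], d)
        else:
            # a gap of more than one day starts a new encounter
            encounters.append((d, d))
    return encounters


days = [date(2019, 1, 1), date(2019, 1, 2), date(2019, 1, 3), date(2019, 1, 5)]
for adm, dis in group_consecutive_dates(days):
    print(adm, dis)
```

In practice this would be applied per GRID, so that dates from different subjects are never merged into one encounter.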
There are three ways to identify sepsis.
- Rhee definition (currently used)
- Sepsis-3 definition
- Sepsis ICD code
- Identify CAM-ICU encounters that meet Rhee's presumed serious infection definition.
- To find >= 4 qualifying antibiotic days (QADs) starting within 2 days of blood culture day:
- Find whether an antibiotic was new, i.e., not given in the prior 2 calendar days.
- Keep only new antibiotics given within 2 days of blood culture day, these are the starting dates of QADs.
- Check whether there are 4 QADs counting from the starting dates.
- For starting days, calculate # of calendar days,
- Code: rhee_infection.R
- Output Data: Girard_BioVU/output/rhee_infection_20191015.csv
- Among the presumed serious infections identified above, find which ones met Rhee's acute organ dysfunction definition.
- Code: rhee_organ_dysfunction.R
- Output Data: Girard_BioVU/output/sepsis_rhee_20191217.csv
- Identify CAM-ICU encounters that meet Sepsis-3's suspected infection definition.
- Code: sepsis3_infection.R
- Output Data: Girard_BioVU/output/sepsis3_all_infections_20190927.csv
- Calculate daily SOFA score for all CAM-ICU encounters.
- Code: daily_sofa.R
- Output Data: Girard_BioVU/output/daily_sofa_score_20191010.csv
- Among the suspected infections identified above, find which ones met Sepsis-3's organ dysfunction definition.
- Code: sepsis3.R
- Output Data: Girard_BioVU/output/sepsis3_20191014.csv
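The first two QAD steps of the Rhee infection definition described above (flag new antibiotics, then keep those near the blood culture day) can be sketched as below. This is a hedged Python illustration rather than the logic in rhee_infection.R: the function name, the input layout (drug name mapped to the set of days it was given), treating "new" per drug, and reading "within 2 days" as a symmetric window of ±2 calendar days are all assumptions. The count of 4 QADs from each starting date is not implemented here.

```python
from datetime import date, timedelta
from typing import Dict, List, Set


def new_antibiotic_starts(abx_days: Dict[str, Set[date]],
                          culture_day: date,
                          lookback_days: int = 2,
                          window_days: int = 2) -> List[date]:
    """Return candidate QAD starting dates for one blood culture day."""
    starts: Set[date] = set()
    for drug, days in abx_days.items():
        for d in days:
            # "new" = same drug not given in the prior `lookback_days` calendar days
            prior = {d - timedelta(days=k) for k in range(1, lookback_days + 1)}
            is_new = not (prior & days)
            # assumption: window is symmetric around the blood culture day
            in_window = abs((d - culture_day).days) <= window_days
            if is_new and in_window:
                starts.add(d)
    return sorted(starts)


abx = {
    "vancomycin": {date(2019, 1, 3), date(2019, 1, 4), date(2019, 1, 5)},
    "cefepime": {date(2019, 1, 1), date(2019, 1, 2), date(2019, 1, 10)},
}
print(new_antibiotic_starts(abx, culture_day=date(2019, 1, 4)))
```

The drug names and dates in the usage example are made up for illustration only.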
The code compare_sepsis.R does the following:
- Compare three criteria at encounter level.
- Output Data: Girard_BioVU/output/sepsis_compare_20191217.csv
- Find distinct GRIDs with sepsis and see which ones have genotype data
- Output Data:
- Girard_BioVU/output/grid_not_in_genotype_status_20200106.csv
- Girard_BioVU/output/sepsis_grids_20200106.xlsx
- Check the encounters with sepsis code but negative for both sepsis definitions
- We decided to use only the Rhee definition to identify sepsis for now.
- Girard_BioVU/code/20191120_sepsis_compare.html
- Includes a missing data summary for the Sepsis-3 definition.
- Girard_BioVU/code/20200106_sepsis_compare.html
- Most current version of the comparison of the three criteria.
2000+ GRIDs were changed due to the EHR system switch. Since the dates were shifted by a different amount for each GRID, not only the GRIDs but also the dates need to be corrected. The code changed_grid_dob.R outputs the DOBs for old and updated GRIDs.
Supposedly, only the older data had the changed GRIDs problem. However, I recommend always checking whether old GRIDs exist in any data used, and following these two steps if old GRIDs do exist.
- Convert the old GRIDs to updated GRIDs.
- Convert all dates of the old GRIDs by date - old_dob + updated_dob.
- Input Data:
- Girard_BioVU/output/data_raw.RData
- static_raw had all GRIDs in old EHR system and DOB.
- Mito Delirium BioVU Data/Data/Changed_GRIDS.xlsx
- Old and updated GRIDs only, no DOB.
- Mito Delirium BioVU Data/Demographics/Set_*_20180830_demo.txt
- GRID, primary GRID (if GRID was old and changed), and DOB for all GRIDs.
- DOB discrepancy between this file and the other two sources.
- Mito Delirium BioVU Data/Demographics/Sample_Genotyping_Status.xlsx
- GRID and DOB.
- Output Data:
- Girard_BioVU/output/changed_grid_dob_20190924.csv
- Girard_BioVU/output/dob_discrepancy.csv
- Discrepancy in DOB between Mito Delirium BioVU Data/Demographics/Set_*_20180830_demo.txt and the other two sources for DOB; can be ignored.
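The two correction steps above (map the old GRID to the updated GRID, then shift each date by date - old_dob + updated_dob) can be sketched as follows. This is a minimal Python illustration, not the project's R code; the function names, the lookup-table layout, and the example GRIDs `OLD001` and `NEW001` are hypothetical.

```python
from datetime import date
from typing import Dict, Tuple

# Hypothetical lookup: old GRID -> (updated GRID, old DOB, updated DOB)
ChangedGrids = Dict[str, Tuple[str, date, date]]


def correct_date(event_date: date, old_dob: date, updated_dob: date) -> date:
    """Shift a date recorded under an old GRID: date - old_dob + updated_dob."""
    return event_date - old_dob + updated_dob


def correct_record(grid: str, event_date: date,
                   changed: ChangedGrids) -> Tuple[str, date]:
    """Return (GRID, date) after correction; records with current GRIDs pass through."""
    if grid not in changed:
        return grid, event_date
    new_grid, old_dob, updated_dob = changed[grid]
    return new_grid, correct_date(event_date, old_dob, updated_dob)


changed = {"OLD001": ("NEW001", date(1950, 1, 1), date(1950, 1, 11))}
print(correct_record("OLD001", date(2015, 3, 10), changed))
```

The shift keeps the subject's age at each event unchanged, which is why the DOB difference is the right offset.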
The code check_resp_ratio.R calculates respiration ratios for SOFA score. I believe we will get more respiration data in the future.
- Calculate PaO2/FiO2 and compare with already available ratio data.
- Decided to use calculated PaO2/FiO2 instead of already available ratio data.
- Input Data:
- Mito Delirium BioVU Data/Lab values/PO2_FIO2_ratio/*.xlsx is the already available ratio data.
- Mito Delirium BioVU Data/Lab values/FIO2/*.xlsx
- Mito Delirium BioVU Data/Lab values/Arterial pO2/*.xlsx
- Output Data: Girard_BioVU/output/pao2_fio2_ratio_calc_20190927.csv
- Check and correct FiO2 values
- FiO2 is a fraction and should be 0-1.
- Any FiO2 >= 100 was divided by 100.
- Check FiO2 < 0.21 with Nasal O2 data.
- Calculate SpO2/FiO2
- Input Data:
- Mito Delirium BioVU Data/Lab values/FIO2/*.xlsx
- Mito Delirium BioVU Data/Lab values/O2Sat/*.xlsx
- Output Data: Girard_BioVU/output/spo2_fio2_ratio_calc_20191010.csv
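The FiO2 correction and ratio calculation described above can be sketched as below. This is a hedged Python illustration, not the logic in check_resp_ratio.R: the function names are hypothetical, the cutoff of 100 is taken literally from the text, and returning `None` for FiO2 values that still fall outside 0.21 to 1 (the ones the text says to check against Nasal O2 data) is an assumption about how such values might be held back.

```python
from typing import Optional


def clean_fio2(fio2: float, percent_cutoff: float = 100.0) -> float:
    """Divide FiO2 values at or above the cutoff by 100, as described above."""
    return fio2 / 100.0 if fio2 >= percent_cutoff else fio2


def oxygen_ratio(o2_value: float, fio2: float) -> Optional[float]:
    """PaO2/FiO2 (or SpO2/FiO2) using the corrected FiO2 fraction.

    Assumption: values outside the plausible range 0.21-1 are returned as
    None so they can be reviewed instead of producing a ratio.
    """
    fraction = clean_fio2(fio2)
    if not 0.21 <= fraction <= 1.0:
        return None
    return o2_value / fraction


print(oxygen_ratio(100, 0.5))   # PaO2 100 on FiO2 0.5
print(oxygen_ratio(80, 100))    # FiO2 recorded as 100 is treated as 1.0
print(oxygen_ratio(80, 0.1))    # implausible FiO2 is held back for review
```

The same function serves both ratios because only the numerator (PaO2 or SpO2) differs.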
The code check_sepsis_discrepancy.R checks why some encounters only met the Rhee definition but not the Sepsis-3 definition.
The code check_pt_loc.R tabulates patient location data to see whether it will help to identify whether patients were in the ICU. Decided not to use it for now.
- Input Data: Mito Delirium BioVU Data/Lab values/patient_Location/*.xlsx
- Output Data: Girard_BioVU/output/patient_cam_visit_location_count.csv
Data dictionaries for the currently in-use output data can be found in the data_dict folder. They have the same names as the output data.