Crystal R. Musser, M.S. (Wet Lab Manager, Katrina Claw Lab @ CU Anschutz, Department of Biomedical Informatics)
Combine quantified LC-MS/MS nicotine metabolite data with self-reported phase survey data from the specified cohort.
- "t_allSamplesMetabolites" for the curated nicotine metabolite dataset
- "t_allSamples" as the master dataframe (all variables, all samples; post-QC, after specific exclusions)
- "clean_df" as the finalized, shared dataset: 18 variables, n=816 (incl. 28 controls)
- Data dictionaries (Supplementary File 1)
- "Clean" variable consensus/logic UML (Supplementary File 2)
- combined experimental LC-MS/MS data, calculated and QC'd, as serum concentrations in ng/mL for: Cotinine, 3-hydroxy-Cotinine, Nicotine, CO, NO, Nornicotine (in this order) ---> "v6RawData_combined.csv"
- cohort survey results with associated data dictionary for interpretation ---> "phase2.xlsx" and "smoking.xlsx"
- phase I survey results with associated data dictionary for interpreting age, gender, smoking history, height, weight ---> "phase1.xlsx"
- A list of all combined results for all 824 samples; t_allSamples_postExclusions, file = "SHSph2_fullDF_05152024CRM.csv"
- A "clean" dataset with 18 variables for the final 816 participants to use for EWAS/GWAS; clean_df, file = "SHSph2_NMRwSurveyDF_clean05152024CRM.csv"
- A list of just sampleIDs of final 816 participants; clean_df$idNo, file = "SHSph2_IDs_clean05152024CRM.csv"
- I'm bringing in calculated metabolic data using Kaja's data output from Teddy's script referencing Laura's library, and am assuming reported values pass QC
- I intend to change the hardcoded bits into variables for scalability in a larger bioinformatic pipeline (dev in progress, 05/15/2024 CRM)
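The combine step described above can be sketched in base R as a left join of the metabolite table onto the survey table. This is only an illustration under stated assumptions: the two small data frames are made-up stand-ins, the column names cotinine_ng_mL and age are hypothetical, and idNo is assumed to be the shared key in both tables.

```r
# Hypothetical stand-ins for the metabolite and survey tables;
# the real column names and key may differ.
metabolites <- data.frame(
  idNo = c("S001", "S002", "S003"),
  cotinine_ng_mL = c(12.5, 88.1, 3.2),
  stringsAsFactors = FALSE
)
survey <- data.frame(
  idNo = c("S001", "S002", "S004"),
  age = c(54, 61, 47),
  stringsAsFactors = FALSE
)

# Left join: keep every metabolite row; survey fields become NA
# where no survey record shares the same idNo.
combined <- merge(metabolites, survey, by = "idNo", all.x = TRUE)

# IDs without a survey match, to review before any exclusion.
# Note: this uses a missing age as the signal, so it would also flag
# matched participants whose age was simply not recorded.
unmatched <- combined$idNo[is.na(combined$age)]
```

Using all.x = TRUE rather than an inner join keeps unmatched samples visible, so exclusions stay an explicit decision instead of a side effect of the merge.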
Calculated metabolite concentration data > Dataframe curation Script > Data Cleaning Script
Raw experimental LC-MS/MS data > metabolite standard curve generation > concentration back-calculations
Data Analysis Script > Figures & Tables Script
- write in a check that no data is being overwritten for a sample's concentration variable (i.e., reruns in the same file); how should this be handled?
- join with samples on my rerun list .xlsx
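One possible way to approach the two items above, sketched in base R: flag every row whose sample ID appears more than once, then join those rows against the rerun list to see which duplicates are expected. The data frames, the cotinine_ng_mL column, and the reason column are hypothetical placeholders, and idNo is assumed to be the sample key; how to resolve a confirmed rerun (keep latest, average, or exclude) is left open here.

```r
# Hypothetical example data; not the real schema.
raw <- data.frame(
  idNo = c("S001", "S002", "S002", "S003"),
  cotinine_ng_mL = c(12.5, 88.1, 90.4, 3.2),
  stringsAsFactors = FALSE
)
# Hypothetical stand-in for the rerun list spreadsheet.
rerun_list <- data.frame(
  idNo = "S002",
  reason = "placeholder reason",
  stringsAsFactors = FALSE
)

# Return all rows (first and later copies) whose ID occurs more than once.
find_reruns <- function(df, id_col = "idNo") {
  ids <- df[[id_col]]
  dup <- duplicated(ids) | duplicated(ids, fromLast = TRUE)
  df[dup, , drop = FALSE]
}

reruns <- find_reruns(raw)
if (nrow(reruns) > 0) {
  message(sprintf("%d rows share a sample ID; resolve before merging",
                  nrow(reruns)))
}

# Left join onto the rerun list: duplicates with no listed reason
# are unexpected and need a manual look.
checked <- merge(reruns, rerun_list, by = "idNo", all.x = TRUE)
unexplained <- checked[is.na(checked$reason), , drop = FALSE]
```

Running the check before any assignment into the concentration columns means a rerun is reported rather than silently replacing the earlier value.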