GPC Breast Cancer Data Quality Reporting
based on earlier work by Jianghua He, with contributions from Bradley McDowell and Theresa Shireman, under the direction of Elizabeth Chrischilles
by Dan Connolly, with Russ Waitman, Tamara McMahon, and Vince Leonardo
Medical Informatics Division, University of Kansas Medical Center
Copyright (c) 2015 University of Kansas Medical Center
Share and Enjoy according to the terms of the MIT Open Source License.
Background
On 23 Dec 2014, GPC honest brokers were requested to run a breast cancer cohort query and submit results (see bc_qa2.Rmd for details). All participating sites have now done so, and we are evaluating the results (227) using automated reports built with R Markdown.
An initial QA report was sent to each site 23 Feb 2015.
Site Usage
In future iterations, sites are encouraged to run this report on their own before submitting:
- Get the bc_qa code
  - Visit bc_qa downloads and get a zip file, or
  - clone the https://bitbucket.org/gpcnetwork/bc_qa repository
- Build Query Terms and Exclusion Criteria article
  - In RStudio, install.packages(pkgs=c('RSQLite', 'ggplot2', 'reshape', 'xtable'))
  - Open bc_qa2.Rmd and Knit HTML
    - output: bc_terms_results.RData
- Build QA for SITE article
  - Use DataBuilder or equivalent to generate sqlite file.
  - Copy dataset-example.R to dataset.R and edit filename etc.
  - Knit bc_excl.Rmd
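The setup steps above can also be scripted. The following is a minimal sketch, not code from the bc_qa repository: the helper names missing_packages and init_dataset_config are hypothetical, and it assumes the working directory is a bc_qa checkout that contains dataset-example.R.

```r
# Packages named in the install step above.
required <- c("RSQLite", "ggplot2", "reshape", "xtable")

# Return the subset of `pkgs` that is not installed yet.
# (Hypothetical helper, not part of bc_qa.)
missing_packages <- function(pkgs) {
  installed <- rownames(installed.packages())
  setdiff(pkgs, installed)
}

# Copy dataset-example.R to dataset.R unless dataset.R already exists,
# so that local edits are never overwritten.
# Returns TRUE if a copy was made, FALSE if dataset.R was already there.
# (Hypothetical helper, not part of bc_qa.)
init_dataset_config <- function(dir = ".") {
  src <- file.path(dir, "dataset-example.R")
  dest <- file.path(dir, "dataset.R")
  if (!file.exists(src)) {
    stop("dataset-example.R not found in ", dir)
  }
  if (file.exists(dest)) {
    return(invisible(FALSE))
  }
  invisible(file.copy(src, dest))
}

# Typical use from the bc_qa directory:
#   install.packages(missing_packages(required))
#   init_dataset_config()
```

After editing dataset.R, and assuming the rmarkdown package is installed, rmarkdown::render("bc_excl.Rmd") should be the scripted counterpart of the Knit HTML button.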
Central Usage
As new submissions come in, members of the breast cancer research team can reproduce the analysis of data from all sites:
- Fetch all the data files.
  - Knit bc_fetch.Rmd to build bc_fetch_results.RData
- Build Query Terms and Exclusion Criteria article as above
- Build any QA for SITE articles you like, using dataset.R as below.
- Build Data by Site presentation
  - Open bc_qa_p1.Rpres in RStudio and use the presentation tab.
- To mail results to all sites
  - Comment out SITE <- ... in dataset.R and run report-all-sites.R.
  - Move the report-SITE.html files to data-files.
  - Use report_mail.py to mail the reports.
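The "move the report-SITE.html files" step can be sketched in R as follows. This is an illustration under stated assumptions, not the project's own tooling: report_name and move_reports are hypothetical helpers, and the reports are assumed to sit in the directory where report-all-sites.R was run.

```r
# Build the report file name for each site code, e.g. "report-KUMC.html".
# (Hypothetical helper, not part of bc_qa.)
report_name <- function(site) sprintf("report-%s.html", site)

# Move the per-site reports found in `from` into `to`, creating `to` if needed.
# Returns a named logical vector: one entry per report that was found,
# TRUE where the move succeeded. Sites without a report are skipped.
# Note: file.rename can fail when `from` and `to` are on different file systems.
# (Hypothetical helper, not part of bc_qa.)
move_reports <- function(sites, from = ".", to = "data-files") {
  dir.create(to, showWarnings = FALSE, recursive = TRUE)
  src <- file.path(from, report_name(sites))
  found <- file.exists(src)
  moved <- file.rename(src[found], file.path(to, basename(src[found])))
  setNames(moved, sites[found])
}
```

Calling move_reports("KUMC") from the working directory would move report-KUMC.html into data-files, where the source says report_mail.py is then used to mail the reports.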
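# dataset.R (central usage): choose the site whose submission the QA report covers.
# The saved results are built by knitting bc_fetch.Rmd and are expected to provide `fetch`.
load("bc_fetch_results.RData")
SITE <- 'KUMC' # Salt to taste
# Evaluates to a list holding the site's data connection and its dataset metadata.
(function (s) {
  list(
    conn=fetch$site.data(s),
    about=subset(fetch$dataset, site == s)
  )
})(SITE)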
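To see the shape of the value this snippet produces without the real data files, one can substitute a stand-in for fetch. Everything about the stand-in below is an assumption made for illustration: the filename column, the second site code SITE2, and a site.data that returns a placeholder string instead of a database connection.

```r
# Stand-in for the `fetch` object normally loaded from bc_fetch_results.RData.
# Its real structure may differ; only the two members used above are mimicked.
fetch <- list(
  dataset = data.frame(
    site = c("KUMC", "SITE2"),        # SITE2 is a made-up site code
    filename = c("a.db", "b.db"),     # hypothetical column
    stringsAsFactors = FALSE
  ),
  # The real function presumably returns a database connection;
  # a string keeps this sketch free of package dependencies.
  site.data = function(s) paste0("connection-for-", s)
)

SITE <- "KUMC"

# Same selection logic as above, with the result captured in a variable.
input <- (function(s) {
  list(
    conn = fetch$site.data(s),
    about = subset(fetch$dataset, site == s)
  )
})(SITE)

str(input)
```

The result is a two-element list: conn for reading the site's data and about holding only the metadata row for the chosen site.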