CTRU R package

CTRU R Functions

This repository contains R functions to facilitate work at the Sheffield Clinical Trials Research Unit (CTRU), part of the School of Health and Related Research (ScHARR) at The University of Sheffield. The intention is to share code between colleagues so that common repetitive tasks become trivial and we do not spend time solving the same problems.

Readers may also find the following slides useful. They are written by the author of this package and contain a host of examples and links to additional resources on using R in a reproducible workflow...

  • RepRoducibility Slides written by the author of this package on using R to work in a reproducible manner.
  • Reporting in the 21st Century slides for a short presentation to the Medical Statistics Group on working in a reproducible manner.

Installation and Usage

If all you want to do is use these functions then it's pretty straightforward to install them thanks to the devtools package. Install it from CRAN and then install this repository from GitHub...

install.packages('devtools')
devtools::install_github('ns-ctru/ctru')
## And of course load the library
library(ctru)

You can now use the functions read_prospect(), fields_prospect() and so forth.

Shiny Application(s)

The package now includes a Shiny application (i.e. interactive Web page) that allows the calculation of sample sizes using a number of different R packages. A helper function is included so that once you have installed and loaded the library (as described above) you can start the application using...

ctru_shiny()

Read more about the included Shiny applications below.

Collaborating

To collaborate in this work you will need to install Git on your computer and have a GitHub account. If you're not familiar with either of these you may find the tutorial Conversational Git a useful place to start. The GitHub help pages are also excellent.

Once you've got a GitHub account you need to fork the ns-ctru/ctru repository, clone your fork to your computer to work on it, make changes/additions, push them back to your fork and then make a pull request.

SSH Keys

I would advocate using SSH Keys with your GitHub account to make it easy to push updates without having to enter your password every single time.

Functions

read_prospect()

  • Function to facilitate reading and labelling of data exported as plain text files from the CTRU 'bespoke' database Prospect.
  • Uses the exported Lookups.csv to convert all factor variables to the correct encoding.
  • Unfortunately it can't recreate the relational nature of the data that exists within the database from which it has been exported :-/.
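
The lookup-based labelling described above can be sketched in base R. This is only an illustration of the idea, not the code used by read_prospect(): the column names field, code and label, and the helper label_factors(), are hypothetical, and the real Lookups.csv exported from Prospect may be laid out differently.

```r
# Hypothetical lookup table; the real Lookups.csv layout may differ.
lookups <- data.frame(
  field = c("sex", "sex", "smoker", "smoker"),
  code  = c(1, 2, 0, 1),
  label = c("Male", "Female", "No", "Yes"),
  stringsAsFactors = FALSE
)

# Hypothetical exported data with numerically coded variables.
df <- data.frame(id = 1:4, sex = c(1, 2, 2, 1), smoker = c(0, 1, 0, 0))

# Convert every column that has entries in the lookup table to a labelled
# factor; columns without lookup entries (here, id) are left untouched.
label_factors <- function(data, lookups) {
  for (f in intersect(names(data), unique(lookups$field))) {
    lk <- lookups[lookups$field == f, ]
    data[[f]] <- factor(data[[f]], levels = lk$code, labels = lk$label)
  }
  data
}

labelled <- label_factors(df, lookups)
str(labelled)
```

Keeping the codes and labels in one long table means a single loop can label any number of exported files, which is presumably why a shared lookup file is convenient here.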

ToDo

  • Add functionality to download 'Fields' and 'Forms' tabs from DM Googlesheets using the googlesheets package.
  • Have event_name converted to factor internally (may require inclusion of event_name in Lookups.csv that is exported from Prospect).

recruitment()

  • Function to summarise screening and recruitment(/enrolment) to studies.
  • Produces tables and figures overall and by study site.
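
As a rough idea of the kind of table such a summary produces, here is a minimal base R sketch. It does not use recruitment() itself; the data frame screening and its columns (site, screened_on, recruited) are made up for illustration.

```r
# Hypothetical screening log: one row per person screened.
screening <- data.frame(
  site        = c("A", "A", "B", "B", "B", "A"),
  screened_on = as.Date(c("2016-01-05", "2016-01-20", "2016-01-22",
                          "2016-02-03", "2016-02-10", "2016-02-15")),
  recruited   = c(TRUE, FALSE, TRUE, TRUE, FALSE, TRUE)
)
screening$month       <- format(screening$screened_on, "%Y-%m")
screening$n_screened  <- 1L
screening$n_recruited <- as.integer(screening$recruited)

# Numbers screened and recruited by site and month.
by_site <- aggregate(cbind(n_screened, n_recruited) ~ site + month,
                     data = screening, FUN = sum)

# The same summary overall, with a running total of recruits.
overall <- aggregate(cbind(n_screened, n_recruited) ~ month,
                     data = screening, FUN = sum)
overall$cum_recruited <- cumsum(overall$n_recruited)

print(by_site)
print(overall)
```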

ToDo

  • Generate plots by site (don't need to do tables, since they can be subsetted from the master)
  • Possibly add option to summarise by treatment arm too.

table_summary()

  • Function to summarise specified measurements (numerical/continuous and factor variables are handled) by the specified subset and time points.
  • For numerical/continuous variables, N/Mean/SD/Min/Max/Median/IQR are reported for the specified variables by the specified grouping.
  • For factor variables, numbers and proportions are reported for the specified variables.
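
The statistics listed above can be reproduced in base R along the following lines. This is a sketch of the calculations only, not of table_summary() or its arguments; the data frame trial and the helper summarise_numeric() are hypothetical.

```r
# Hypothetical trial data.
trial <- data.frame(
  group = c("Control", "Control", "Control",
            "Treatment", "Treatment", "Treatment"),
  age   = c(50, 60, 70, 40, 55, 70),
  sex   = factor(c("F", "M", "F", "M", "M", "F"))
)

# N/Mean/SD/Min/Max/Median/IQR for one numeric vector, ignoring missing values.
summarise_numeric <- function(x) {
  x <- x[!is.na(x)]
  c(n = length(x), mean = mean(x), sd = sd(x), min = min(x),
    max = max(x), median = median(x), iqr = IQR(x))
}

# Continuous variable: one row of statistics per group.
age_summary <- do.call(rbind, lapply(split(trial$age, trial$group),
                                     summarise_numeric))

# Factor variable: counts and within-group proportions.
sex_counts <- table(trial$group, trial$sex)
sex_props  <- prop.table(sex_counts, margin = 1)

print(age_summary)
print(sex_counts)
print(round(sex_props, 2))
```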

ToDo

  • Full support for Non-Standard Evaluation when explicitly supplying grouping variables as an argument rather than via the dots argument (...).

plot_summary()

  • Function to plot specified measurements by specified subset.
  • Produces histograms by specified treatment groups for continuous variables.
  • Produces bar charts by specified treatment groups for factor variables.
  • Pooled plots are produced and optionally individual plots for each variable can be produced.

ToDo

  • For factor variables need to group responses into surveys and facet_grid() them with rows for surveys and columns for the specified groups.
  • Extend factor summaries to be performed by specified events.
  • Finish off plotting continuous variables by variable (rows) and event (columns).

idm_lsoa()

ToDo

  • Add in 2010 data.
  • Add in data on LSOAs in Wales.

eq5d_score()

  • Function for calculating EQ5D-5L scores (see slides 40 and 41 for scoring). Could possibly have it summarise and plot scores by user-specified variables (the default being the event and the group).

ToDo

  • Very much a work in progress; need to fully understand Non-Standard Evaluation to get the function working and fully flexible.

consort()

ToDo

  • Everything, most likely using the diagram package (further examples here).
  • This may not be that straight-forward to abstract in light of the way CTRU data is (un)structured as there is no single file that defines who was seen at what stage, all numbers need extracting from the available data. Kind of the thing that databases are geared towards really.

regress_ctru()

ToDo

  • Include options to set the reference level (via relevel()) for each factor variable in a model (something akin to the way texreg() handles things).
  • Option (default) to exponentiate model coefficients and CIs when the model family is binomial (i.e. logistic regression), so that odds ratios are reported.
  • Include ability to bootstrap regression results, particularly important for mixed models where p-values are unreliable due to uncertainty in the degrees of freedom. Some leverage to do this via texreg() but stargazer() is a more flexible tabulating option.
  • Include all results from ITT/PP models, coefficients and CIs, p-values as part of the returned list which can then be parsed for inclusion in text.
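
The first two items can be illustrated with base R alone. The sketch below is not regress_ctru(); it uses simulated data with made-up variable names (arm, age, event) and Wald intervals from confint.default(), which is just one of several ways to obtain CIs.

```r
set.seed(1)
n <- 200

# Simulated (hypothetical) trial data.
dat <- data.frame(
  arm = factor(sample(c("Active", "Placebo"), n, replace = TRUE)),
  age = rnorm(n, mean = 60, sd = 10)
)
dat$event <- rbinom(n, 1, plogis(-1 + 0.8 * (dat$arm == "Active")))

# Factor levels are alphabetical by default, so "Active" would be the
# reference; relevel() makes "Placebo" the reference level instead.
dat$arm <- relevel(dat$arm, ref = "Placebo")

fit <- glm(event ~ arm + age, data = dat, family = binomial(link = "logit"))

# Exponentiate coefficients and Wald 95% CIs to obtain odds ratios.
or_table <- exp(cbind(OR = coef(fit), confint.default(fit)))
print(round(or_table, 3))
```

Exponentiating both ends of the interval keeps the CI consistent with the odds ratio, because exp() is monotonic.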

Shiny Applications

Shiny applications are included in this package (currently n = 1). A helper function (ctru_shiny()) is included to start the different applications. It includes the option to specify the display.mode which can be useful if you wish to look at the source code in the application (use the option display.mode = "showcase" if so).

Sample Size Calculations

A WebUI to a number of R packages which will calculate sample sizes and/or power for the specified parameters. To start it run...

ctru_shiny(example = 'sample_size')
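
For readers who have not done such a calculation in R before, the sketch below shows the kind of computation involved, using power.t.test() from the base stats package. This is an assumption made for illustration: it is not necessarily one of the packages the application wraps, and the effect size, standard deviation and power are arbitrary example values.

```r
# Sample size per arm for a two-sample t-test: detect a difference of
# 0.5 SD with 80% power at a two-sided 5% significance level.
res <- power.t.test(delta = 0.5, sd = 1, sig.level = 0.05, power = 0.8,
                    type = "two.sample", alternative = "two.sided")
n_per_arm <- ceiling(res$n)

# Power actually achieved once n is rounded up to a whole participant.
achieved <- power.t.test(n = n_per_arm, delta = 0.5, sd = 1,
                         sig.level = 0.05)$power

print(n_per_arm)
print(achieved)
```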

Links

A few links to other resources that people might find useful...

  • RepRoducibility Slides written by the author of this package on using R to work in a reproducible manner.
  • Reporting in the 21st Century slides for a short presentation to the Medical Statistics Group on working in a reproducible manner.