Tools to build CDISC compliant data sets and check for CDISC compliance.

xportr

Lifecycle: experimental

Welcome to xportr! We designed xportr to help you get your xpt files ready for transport, either to a clinical data set validator application or to a regulatory agency. The package can associate metadata with a local R data frame, perform data set level validation checks, and convert the result into a transport v5 file (xpt).

As always, we welcome your feedback. If you spot a bug, would like to see a new feature, or find any documentation unclear, please submit an issue on xportr’s GitHub page.

Installation

Development version:

devtools::install_github("https://github.com/atorus-research/xportr.git")

CRAN

  • As this is an experimental package still under development, it is not yet available on CRAN.

What is xportr?


xportr is designed for clinical programmers who need to create CDISC compliant xpt files, whether ADaM or SDTM. The package has two main components: writing xpt files with well-defined metadata, and checking the compliance of data sets. The first set of tools allows a clinical programmer to build a CDISC compliant xpt file directly from R. The second set of tools performs checks on your data sets before you send them off to any validators or data reviewers.
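
To make "well-defined metadata" a little more concrete, here is a minimal base R sketch of how labels can travel with a data frame as attributes. It assumes the `label` attribute convention used by packages such as haven. The `adsl_demo` data and the label text are made up for illustration, and this is not xportr's own code.

```r
# Toy data frame standing in for an ADaM data set (hypothetical values).
adsl_demo <- data.frame(
  USUBJID = c("01-001", "01-002"),
  AGE = c(34, 51),
  stringsAsFactors = FALSE
)

# Variable-level metadata: one label per column, stored as a "label" attribute.
var_labels <- c(USUBJID = "Unique Subject Identifier", AGE = "Age")
for (v in names(var_labels)) {
  attr(adsl_demo[[v]], "label") <- var_labels[[v]]
}

# Data set level metadata: a label on the data frame itself.
attr(adsl_demo, "label") <- "Subject-Level Analysis Dataset"

# Read the labels back to confirm they are attached to the columns.
labels_out <- vapply(adsl_demo, function(x) attr(x, "label"), character(1))
print(labels_out)
```

Keeping metadata in attributes means it stays with the data frame as it is passed from one step to the next, so a writer can pick it up at the end.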



What are the checks?


  • Variable names must start with a letter.
  • Variable names are ≤ 8 characters.
  • The allotted length for each column containing character (text) data should be set to the maximum length of that variable across all data sets (≤ 200).
  • Variables are coerced to numeric or character types only.
  • Display formats are supported for numeric float and date/time values.
  • Variable labels are ≤ 40 characters.
  • Data set labels are ≤ 40 characters.
  • Variable names, variable labels, and data set labels are checked for non-ASCII characters.

NOTE: Each check has associated messages and warnings.
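
As a rough illustration of what a few of these checks amount to, the base R sketch below tests variable names (leading letter, length, ASCII only) and the data set label length. The helper `check_xpt_names()` is hypothetical and is not part of xportr; the package's own functions and messages may differ.

```r
# Hypothetical helper, not part of xportr: returns a character vector of
# findings, which is empty when every check passes.
check_xpt_names <- function(df) {
  vars <- names(df)
  findings <- character(0)

  # Variable names must start with a letter.
  bad_start <- vars[!grepl("^[A-Za-z]", vars)]
  if (length(bad_start) > 0) {
    findings <- c(findings, paste0("Name does not start with a letter: ", bad_start))
  }

  # Variable names must be at most 8 characters long.
  too_long <- vars[nchar(vars) > 8]
  if (length(too_long) > 0) {
    findings <- c(findings, paste0("Name is longer than 8 characters: ", too_long))
  }

  # Variable names must contain ASCII characters only (code points up to 127).
  is_non_ascii <- vapply(
    vars,
    function(v) any(utf8ToInt(enc2utf8(v)) > 127L),
    logical(1)
  )
  if (any(is_non_ascii)) {
    findings <- c(findings, paste0("Name contains non-ASCII characters: ", vars[is_non_ascii]))
  }

  # The data set label, if one is attached, must be at most 40 characters.
  ds_label <- attr(df, "label")
  if (!is.null(ds_label) && nchar(ds_label) > 40) {
    findings <- c(findings, "Data set label is longer than 40 characters")
  }

  findings
}

# Toy data with two deliberate problems (check.names = FALSE keeps "1VISIT" as is).
demo <- data.frame(
  USUBJID = "01-001",
  `1VISIT` = 1,
  TRTSTARTDATE = "2020-01-01",
  check.names = FALSE
)
findings <- check_xpt_names(demo)
for (f in findings) message(f)
```

Returning the findings as a vector, instead of stopping at the first problem, lets the caller decide whether to report them as messages, warnings, or errors.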

Example

The first example involves an ADSL data set in the .sas7bdat format with associated specification in the .xlsx format.

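library(magrittr) # provides the %>% pipe
library(xportr)

# Read the ADSL data set.
adsl <- haven::read_sas("inst/extdata/adsl.sas7bdat")

# Variable-level specification: rename "Data Type" to type, then lower-case all column names.
var_spec <- readxl::read_xlsx("inst/specs/ADaM_spec.xlsx", sheet = "Variables") %>%
  dplyr::rename(type = "Data Type") %>%
  rlang::set_names(tolower)

# Data set level specification: lower-case the column names, then use the description as the label.
data_spec <- readxl::read_xlsx("inst/specs/ADaM_spec.xlsx", sheet = "Datasets") %>%
  rlang::set_names(tolower) %>%
  dplyr::rename(label = "description")

# Apply types, lengths, and labels from the specs, then write the xpt file.
adsl %>%
  xportr_type(var_spec, "ADSL", "message") %>%
  xportr_length(var_spec, "ADSL", "message") %>%
  xportr_label(var_spec, "ADSL", "message") %>%
  xportr_df_label(data_spec, "ADSL") %>%
  xportr_write("adsl.xpt")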
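
For readers without the example files at hand, the idea behind the type and length steps can be sketched in base R with a toy specification. The `toy_spec` column names (`variable`, `type`) and all values are assumptions made for this illustration. This is not how xportr implements these steps, and it looks at a single toy data set only.

```r
# Toy data: AGE arrives as text even though the spec calls for a number.
dm_demo <- data.frame(
  USUBJID = c("01-001", "01-002"),
  AGE = c("34", "51"),
  stringsAsFactors = FALSE
)

# Toy specification (hypothetical column names and values).
toy_spec <- data.frame(
  variable = c("USUBJID", "AGE"),
  type = c("character", "numeric"),
  stringsAsFactors = FALSE
)

# Coerce each column to the type named in the specification.
for (i in seq_len(nrow(toy_spec))) {
  v <- toy_spec$variable[i]
  dm_demo[[v]] <- if (toy_spec$type[i] == "numeric") {
    as.numeric(dm_demo[[v]])
  } else {
    as.character(dm_demo[[v]])
  }
}

# For character columns, take the longest observed value as the allotted length.
char_cols <- names(dm_demo)[vapply(dm_demo, is.character, logical(1))]
char_lengths <- vapply(
  char_cols,
  function(v) max(nchar(dm_demo[[v]])),
  integer(1)
)

str(dm_demo)
print(char_lengths)
```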

Where to go from here?

Please check out the Get Started guide for more information.

We are in talks with other Pharma companies involved with the {pharmaverse} to enhance this package to play well with other downstream and upstream packages.

References


This package was developed jointly by GSK and Atorus.