pharmaverse/datacutr

process_cut needs a dataset for every argument

Closed this issue · 1 comments

What happened?

process_cut requires at least one dataset for each of the arguments patient_cut_v, date_cut_m and no_cut_v. In my example I didn't want to exclude any dataset from the cut. If I leave no_cut_v unspecified, I get an error. If I pass in an empty vector, I get the error

Error: Every input SDTM dataset must be referenced in exactly one of patient_cut_v,
             date_cut_m or no_cut_v. Note that, if special_dm=TRUE, there is no need to
             specify dm in patient_cut_v, date_cut_m or no_cut_v

The only way to avoid this is to specify one of my datasets as not needing a data cut.
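
To make the rule in the error message concrete, here is a minimal base R sketch of an "exactly one reference per dataset" check. The helper `check_cut_coverage` and its behaviour are assumptions made for illustration only; this is not datacutr's actual validation code, and the real check may differ in its details.

```r
# Hypothetical helper (NOT part of datacutr): returns the input datasets
# that are not referenced exactly once across the three arguments.
check_cut_coverage <- function(dataset_names,
                               patient_cut_v = character(0),
                               date_cut_m = matrix(character(0), nrow = 0, ncol = 2),
                               no_cut_v = character(0),
                               special_dm = TRUE) {
  # Assumption taken from the error message: with special_dm = TRUE,
  # dm does not need to be referenced anywhere.
  if (special_dm) {
    dataset_names <- setdiff(dataset_names, "dm")
  }
  # Column 1 of date_cut_m holds dataset names, column 2 the date variables.
  referenced <- c(patient_cut_v, date_cut_m[, 1], no_cut_v)
  counts <- vapply(dataset_names, function(nm) sum(referenced == nm), integer(1))
  dataset_names[counts != 1]
}

datasets <- c("ae", "dm", "ds", "ec", "lb", "vs")
date_cut_m <- rbind(
  c("ae", "AESTDTC"),
  c("lb", "LBDTC"),
  c("vs", "VSDTC"),
  c("ec", "ECSTDTC")
)

# Every dataset is covered once, so nothing is reported.
all_covered <- check_cut_coverage(datasets, patient_cut_v = "ds", date_cut_m = date_cut_m)

# Dropping the ec row leaves ec unreferenced, so it is reported.
missing_ec <- check_cut_coverage(datasets, patient_cut_v = "ds", date_cut_m = date_cut_m[1:3, ])
```

Under this sketch, a call that covers every dataset through patient_cut_v and date_cut_m alone passes the rule without any entry in no_cut_v, which is the behaviour I was expecting.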

Session Information

R version 4.1.3 (2022-03-10)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.5 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices datasets  utils     methods   base     

other attached packages:
[1] datacutr_0.0.1            dp00001_sdtmvr_0.0.0.9000 jsonlite_1.8.4           

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.10       lubridate_1.9.1   rmint.sdtm_1.7.3  tidyr_1.2.1       prettyunits_1.1.1 ps_1.7.2         
 [7] arrow_10.0.1      assertthat_0.2.1  rprojroot_2.0.3   digest_0.6.31     utf8_1.2.2        mime_0.12        
[13] R6_2.5.1          odbc_1.3.4        roak_3.0.2        httr_1.4.4        pillar_1.8.1      rlang_1.0.6      
[19] rstudioapi_0.14   miniUI_0.1.1.1    callr_3.7.3       urlchecker_1.0.1  blob_1.2.3        desc_1.4.2       
[25] devtools_2.4.5    admiraldev_0.2.0  readr_2.1.3       stringr_1.5.0     htmlwidgets_1.6.1 bit_4.0.5        
[31] shiny_1.7.4       compiler_4.1.3    httpuv_1.6.8      pkgconfig_2.0.3   pkgbuild_1.4.0    arsenal_3.6.3    
[37] clipr_0.8.0       htmltools_0.5.4   tidyselect_1.2.0  tibble_3.1.8      fansi_1.0.4       crayon_1.5.2     
[43] dplyr_1.0.10      tzdb_0.3.0        withr_2.5.0       later_1.3.0       xtable_1.8-4      lifecycle_1.0.3  
[49] DBI_1.1.3         magrittr_2.0.3    cli_3.6.0         stringi_1.7.12    cachem_1.0.6      renv_0.16.0      
[55] fs_1.6.0          promises_1.2.0.1  remotes_2.4.2     ellipsis_0.3.2    generics_0.1.3    vctrs_0.5.2      
[61] tools_4.1.3       bit64_4.0.5       glue_1.6.2        diffdf_1.0.4      purrr_1.0.1       hms_1.1.2        
[67] yaml_2.3.7        processx_3.8.0    pkgload_1.3.2     fastmap_1.1.0     timechange_0.2.0  rice_3.0.1       
[73] sessioninfo_1.2.2 memoise_2.0.1     profvis_0.3.7     usethis_2.1.6    

Reproducible Example

This is a bit hard to turn into a full reprex, but suppose you have source_data with the names "ae" "dm" "ds" "ec" "lb" "vs":

cut_data <- process_cut(
  source_sdtm_data = source_data,
  patient_cut_v = c("ds"),
  date_cut_m = rbind(
    c("ae", "AESTDTC"),
    c("lb", "LBDTC"),
    c("vs", "VSDTC"),
    c("ec", "ECSTDTC")
  ),
  dataset_cut = dcut,
  cut_var = DCUTDTM,
  special_dm = TRUE
)

cut_data <- process_cut(
  source_sdtm_data = source_data,
  patient_cut_v = c("ds"),
  date_cut_m = rbind(
    c("ae", "AESTDTC"),
    c("lb", "LBDTC"),
    c("vs", "VSDTC"),
    c("ec", "ECSTDTC")
  ),
  no_cut_v = "",
  dataset_cut = dcut,
  cut_var = DCUTDTM,
  special_dm = TRUE
)

Both calls fail with an error.

Hello @kieranjmartin, thanks for raising this issue.
Could you please take a look at the 134-process_cut-needs-dataset-every-arguement branch and let me know if you are happy with this update?
I have added default values for the following function arguments:

  • patient_cut_v = vector()
  • date_cut_m = matrix(nrow=0, ncol=2)
  • no_cut_v = vector()

In your reproducible examples above, this means that your first example should now work, because no_cut_v will be set to its default value (an empty vector). However, your second example will still fail, since no_cut_v is expected to be a vector of dataset names but you have set it to "" here. Hopefully this is clear from the documentation of the function, but please let me know if you think further updates are required.
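
As a quick base R illustration of the difference between the new defaults and "", the snippet below only inspects the values themselves and does not call process_cut. The variable name `empty_string` is introduced here purely for the example.

```r
# The default values listed above.
patient_cut_v <- vector()
date_cut_m <- matrix(nrow = 0, ncol = 2)
no_cut_v <- vector()

length(no_cut_v)   # 0: an empty vector references no datasets
nrow(date_cut_m)   # 0: a matrix with two columns and no rows

# "" is a character vector of length one holding an empty string,
# so it is not the same thing as an empty vector.
empty_string <- ""
length(empty_string)   # 1
nzchar(empty_string)   # FALSE

# Combining the defaults yields no dataset references at all.
referenced <- c(patient_cut_v, date_cut_m[, 1], no_cut_v)
length(referenced)     # 0
```

So passing "" presumably behaves like a reference to a dataset whose name is the empty string, whereas the defaults contribute nothing to the set of referenced datasets.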