atorus-research/CDISC_pilot_replication

t-14-3-13.R: tidyr::replace_na breaks with column name `0`


At line 87 of t-14-3-13.R, the following data manipulation call breaks:

replace_na(list(`0`=' 0       ', `54` = ' 0       ', `81`=' 0       '))

One possible fix is to replace the missing values column by column inside mutate():

     mutate(
         `0` = replace_na(`0`, ' 0       '),
         `54` = replace_na(`54`, ' 0       '),
         `81` = replace_na(`81`, ' 0       ')
     )

> sessionInfo()
R version 3.6.2 (2019-12-12)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.4 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/openblas/libblas.so.3
LAPACK: /usr/lib/x86_64-linux-gnu/libopenblasp-r0.2.20.so

locale:
[1] LC_CTYPE=C.UTF-8       LC_NUMERIC=C           LC_TIME=C.UTF-8        LC_COLLATE=C.UTF-8    
 [5] LC_MONETARY=C.UTF-8    LC_MESSAGES=C.UTF-8    LC_PAPER=C.UTF-8       LC_NAME=C             
 [9] LC_ADDRESS=C           LC_TELEPHONE=C         LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C   

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] pharmaRTF_0.1.0  assertthat_0.2.1 haven_2.2.0      forcats_0.5.0    stringr_1.4.0    dplyr_0.8.5     
 [7] purrr_0.3.4      readr_1.3.1      tidyr_1.0.2      tibble_3.0.1     ggplot2_3.3.0    tidyverse_1.3.0 
[13] glue_1.4.0      

loaded via a namespace (and not attached):
[1] nlme_3.1-142      fs_1.4.1          lubridate_1.7.8   httr_1.4.1        tools_3.6.2       backports_1.1.6  
 [7] R6_2.4.1          DBI_1.1.0         gnm_1.1-1         colorspace_1.4-1  nnet_7.3-12       withr_2.2.0      
[13] tidyselect_1.0.0  emmeans_1.4.6     curl_4.3          compiler_3.6.2    cli_2.0.2         rvest_0.3.5      
[19] xml2_1.3.2        scales_1.1.0      lmtest_0.9-37     mvtnorm_1.1-0     digest_0.6.25     foreign_0.8-72   
[25] relimp_1.0-5      minqa_1.2.4       rmarkdown_2.1     ca_0.71.1         rio_0.5.16        pkgconfig_2.0.3  
[31] htmltools_0.4.0   lme4_1.1-23       dbplyr_1.4.3      rlang_0.4.6       readxl_1.3.1      rstudioapi_0.11  
[37] generics_0.0.2    zoo_1.8-8         jsonlite_1.6.1    zip_2.0.4         car_3.0-7         magrittr_1.5     
[43] huxtable_4.7.1    qvcalc_1.0.2      Matrix_1.2-18     Rcpp_1.0.4.6      munsell_0.5.0     fansi_0.4.1      
[49] abind_1.4-5       lifecycle_0.2.0   stringi_1.4.6     yaml_2.2.1        carData_3.0-3     MASS_7.3-51.4    
[55] grid_3.6.2        crayon_1.3.4      lattice_0.20-38   splines_3.6.2     hms_0.5.3         knitr_1.28       
[61] pillar_1.4.4      boot_1.3-23       estimability_1.3  vcdExtra_0.7-4    reprex_0.3.0      evaluate_0.14    
[67] data.table_1.12.8 modelr_0.1.7      vcd_1.4-7         vctrs_0.2.4       nloptr_1.2.2.1    cellranger_1.1.0 
[73] gtable_0.3.0      xfun_0.13         openxlsx_4.1.5    xtable_1.8-4      broom_0.5.6       coda_0.19-3      
[79] statmod_1.4.34    ellipsis_0.3.0 

Thanks for this feedback!

This is an interesting issue. The conflict comes from updates made in tibble_3.0.1. When a variable is named 0 (using backticks to make that a legal name), replace_na treats it as an index instead of a variable name. This doesn't happen with other numbers used as variable names, like 1 or 2; both work correctly as variable names rather than indices. We have logged this as an issue with tidyr, and you can see our reprex of the issue right here.

In any case, we have updated the R script with your suggested syntax within our latest merge.
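
For anyone hitting the same error, here is a minimal sketch of the per-column workaround on a toy tibble. The data and the names df and filled are made up for illustration and are not taken from t-14-3-13.R; only the column labels and the padded replacement string come from the issue above.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data: columns named after the dose levels in the issue.
df <- tibble::tibble(
  rowlbl = c("a", "b"),
  `0`    = c("1 (50.0%)", NA),
  `54`   = c(NA, "2 (100.0%)"),
  `81`   = c(NA_character_, NA_character_)
)

# Apply replace_na() to each column vector inside mutate(), so the
# backticked names are resolved as columns rather than passed as list names.
filled <- df %>%
  mutate(
    `0`  = replace_na(`0`,  " 0       "),
    `54` = replace_na(`54`, " 0       "),
    `81` = replace_na(`81`, " 0       ")
  )

# No missing values should remain in the three dose columns.
stopifnot(!anyNA(filled[c("0", "54", "81")]))
```

Calling replace_na() on the column vector avoids the named-list form entirely, which is why this version should not depend on how the name `0` is looked up.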