Improve experience for non-CDISC data
ddsjoberg opened this issue · 1 comments
ddsjoberg commented
The visualizations for time to event data in visR are amazing and will be used by many people, including those whose data does not follow CDISC conventions. We can do a couple of things to improve their experience, while not taking away from the experience of those working with CDISC data.
- Export a function to convert conventional `Surv(time, event)` coding to `AVAL`, `CNSR` coding. I don't think I'll ever memorize that AVAL is the time column's name 😆 Something small and easy like the example below would work.
- When data is used that does not have 'PARAM' and 'PARAMCD' columns, `visr()` prints a warning (see below) about the x-axis label. We don't need a warning to tell users what was not done. This should be handled similarly to other places in the package regarding labels. For example, if a column has a label, we use it. But if the column doesn't have a label, we use the variable name in `visr()`: we don't print a warning that the variable label wasn't used. It's already well documented in the `visr()` help file that if 'PARAM' and 'PARAMCD' are present, their values will be used to construct the x-axis label. Let's remove this warning.
library(visR)
as_CDISC_names <- function(data, time, event) {
  time <- dplyr::select(data, {{ time }}) %>% names()
  event <- dplyr::select(data, {{ event }}) %>% names()
  Surv <- survival::Surv(time = data[[time]], event = data[[event]]) %>% unclass()
  # convert to indicator of censoring event
  data[[event]] <- 1 - Surv[, 2]
  # rename columns to be in CDISC format
  data %>%
    dplyr::rename(AVAL = !!time, CNSR = !!event)
}
survival::lung %>%
  as_CDISC_names(time, status) %>%
  estimate_KM(strata = "sex") %>%
  visr()
#> Warning in visr.survfit(.): The x-axis label was not specified and could also
#> not be automatically determined due to absence of 'PARAM' and 'PARAMCD'.
Created on 2022-05-10 by the reprex package (v2.0.1)
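The silent fallback described in the second point could be sketched roughly as follows. Both helpers are hypothetical and are not part of visR; in particular, the way the label is assembled from 'PARAM' is an assumption made only for illustration.

```r
# Hypothetical helper, not part of visR: return the "label" attribute of a
# column when one is set, otherwise fall back silently to the column name.
label_or_name <- function(data, column) {
  label <- attr(data[[column]], "label", exact = TRUE)
  if (is.null(label)) column else label
}

# Hypothetical helper, not part of visR: build an x-axis label when both
# 'PARAM' and 'PARAMCD' exist; otherwise return NULL without any warning.
# Assumption: the label is simply the unique 'PARAM' values pasted together.
x_label_from_param <- function(data) {
  if (!all(c("PARAM", "PARAMCD") %in% names(data))) {
    return(NULL)
  }
  paste(unique(data[["PARAM"]]), collapse = ", ")
}

df <- data.frame(time = c(5, 8), status = c(1, 0))
attr(df$time, "label") <- "Days since randomisation"

time_label <- label_or_name(df, "time")      # uses the label attribute
status_label <- label_or_name(df, "status")  # falls back to the variable name
x_label <- x_label_from_param(df)            # NULL, and no warning is raised
```

The point of the sketch is only that a missing label or a missing 'PARAM' column is treated as an ordinary case rather than something to warn about.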
I think there may be a better implementation of the first point. Many users would like a familiar interface of data and formula. Maybe we can implement a formula method for `estimate_KM()`.
estimate_KM <- function(x, ...) {
  UseMethod("estimate_KM")
}
# S3 dispatches on the class name, so the data frame method is `.data.frame`
estimate_KM.data.frame <- function(data, AVAL, CNSR, strata, ...) {
}
estimate_KM.formula <- function(formula, data) {
  # parse the formula and pass elements to `estimate_KM.data.frame()`
}
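As a rough illustration of the "parse the formula" step, the sketch below pulls the time, event, and strata names out of a formula using base R only. `parse_surv_formula()` is a hypothetical name, not an existing visR function, and the sketch assumes the left-hand side is a bare `Surv(time, event)` call with positional arguments.

```r
# Hypothetical helper, not part of visR: split a formula such as
# Surv(time, status) ~ sex into the pieces a data frame method would need.
parse_surv_formula <- function(formula) {
  if (length(formula) != 3) {
    stop("The formula must be two-sided, e.g. Surv(time, status) ~ sex.")
  }
  lhs <- formula[[2]]
  # Assumption: only a bare Surv() call is recognised (not survival::Surv()).
  if (!is.call(lhs) || !identical(lhs[[1]], as.name("Surv")) || length(lhs) < 3) {
    stop("The left-hand side must be a call like Surv(time, event).")
  }
  # Arguments are matched by position only; named arguments are not handled.
  strata <- all.vars(formula[[3]])
  list(
    time = deparse(lhs[[2]]),
    event = deparse(lhs[[3]]),
    strata = if (length(strata) > 0) strata else NULL
  )
}

# The formula is never evaluated, so the survival package is not needed here.
parts <- parse_surv_formula(Surv(time, status) ~ sex)
no_strata <- parse_surv_formula(Surv(time, status) ~ 1)
```

A formula method could then convert the event column to a censoring indicator, as `as_CDISC_names()` does above, before handing `parts$time`, `parts$event`, and `parts$strata` to the data frame method.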
ddsjoberg commented
This should be two issues. I'll resubmit.