MSKCC-Epi-Bio/tidycmprsk

Unexpected outputs using `car::Anova()` on a `tidycrr` object?

fdehrich opened this issue · 7 comments

Hi there,

Following up on a previous issue with more specific information.

Background: When using add_global_p() with crr(), I noticed that the global p-values for some variables seemed off. Specifically, when I releveled a 4-level categorical variable (keeping the reference category the same), the global p-value changed.

The unexpected values in the table seem to be consistent with the values produced by running car::Anova() directly on the tidycrr object - so there might be something happening in that underlying process. I have not seen this issue thus far when using gtsummary::tidy_wald_test() instead.

Below, I run two version of the same model - just with different leveling for the new_categorical variable. The results in the table track with the car::Anova() results. In addition to the global p-value for new_categorical changing, the degrees of freedom seem strange.

Apologies that the sample dataset/model is complex - I have only been able to reproduce this issue so far with messy models...

I will look into this more to see if I can find anything else helpful!

library(gtsummary)
library(tidycmprsk, warn.conflicts = FALSE)
packageVersion("gtsummary")
#> [1] '1.7.1'
packageVersion("tidycmprsk")
#> [1] '0.2.0'

# Set seed for data generation
set.seed(123)

# Update trial data to include additional variables
trial_updated <-
  trial %>% 
  dplyr::mutate(
    new_binary_1 = sample(c(0, 1), replace = TRUE, size = nrow(trial)),
    new_binary_2 = sample(c(0, 1), replace = TRUE, size = nrow(trial)),
    new_categorical = sample(c("A", "B", "C", "D"), replace = TRUE, size = nrow(trial)),
    marker = dplyr::case_when(marker > 1 ~ 1, marker <= 1 ~ 0)
  )

# Fit model
fit <-
  crr(Surv(ttdeath, death_cr) ~ ., 
      failcode = "death from cancer",
      cencode = "censor",
      trial_updated)
#> 27 cases omitted due to missing values

# Display global p-values from tidycmprsk
fit %>%
  tbl_regression(exp = TRUE, include = new_categorical) %>%
  add_global_p(keep = TRUE) %>% 
  as_kable()
Characteristic HR 95% CI p-value
new_categorical 0.14
A
B 0.53 0.20, 1.42 0.2
C 0.51 0.21, 1.25 0.14
D 0.91 0.40, 2.10 0.8
# Display global p-values from car::Anova()
car::Anova(fit) %>% 
  dplyr::filter(., rownames(.) == "new_categorical") %>%
  knitr::kable()
Df Chisq Pr(>Chisq)
new_categorical 1 2.150062 0.1425642
# Relevel `new_categorical` variable
trial_updated <-
  trial_updated %>% 
  dplyr::mutate(
    new_categorical = forcats::fct_relevel(new_categorical, "A", "C", "B", "D")
  )


# Re-fit model
fit <-
  crr(Surv(ttdeath, death_cr) ~ ., 
      failcode = "death from cancer",
      cencode = "censor",
      trial_updated)
#> 27 cases omitted due to missing values

# Display global p-values from tidycmprsk
fit %>%
  tbl_regression(exp = TRUE, include = new_categorical) %>%
  add_global_p() %>% 
  as_kable()
Characteristic HR 95% CI p-value
new_categorical 0.2
A
C 0.51 0.21, 1.25
B 0.53 0.20, 1.42
D 0.91 0.40, 2.10
car::Anova(fit) %>% 
  dplyr::filter(., rownames(.) == "new_categorical") %>%
  knitr::kable()
Df Chisq Pr(>Chisq)
new_categorical 1 1.58982 0.2073519

Created on 2023-05-01 with reprex v2.0.2

Well, I had no idea car::Anova() worked for tidycmprsk objects and we wrote the gtsummary Wald test for the crr model!

I am not sure what to do other than improve the documentation here that we should specify the wald test.

@tengfei-emory thoughts?

@ddsjoberg I think the test used should be aod::wald.test(). I never used car::Anova() for a crr or tidycrr object and I am not sure how it works.

One other thing I could do is make the function calculates the global p-value in add_global_p() an S3 method, and we could export from tidycmprsk a function to dispatch the Wald test instead of car::Anova().

Unfortunately, I just made a gtsummary release a couple of days ago, so that would have to wait for quite a while.

Thanks so much for looking into this (and for all the work you do creating these packages more generally!).

In the meantime, I'll be sure to specify the Wald test when using gtsummary::add_global_p() with tidycmprsk::crr().

@fdehrich this is addressed in this PR #110

Please have a look to confirm it addresses this issue.

@fdehrich we're going to send the package to CRAN soon. Can you confirm this update addresses this issue?

Just ran the reprex code on my end and the p-values are as expected now, thank you!!