sahirbhatnagar/casebase

Standard error for hazard ratio

Closed this issue · 5 comments

Here's my attempt at using the delta method (ref1, ref2, ref3) to calculate the standard error of the time-dependent hazard ratio. The calculation is straightforward if there is no interaction with time.
The harder case is when there is an interaction with time. I get reasonable results using the delta method when the interaction is linear in time (trt:time), but not when it is non-linear (trt:bs(time, df=3)). I'm not sure why. Perhaps it's the way the variance-covariance matrix is calculated?
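For context, here is a minimal standalone sketch of the delta method (my own toy helper, not casebase code): for a function g(beta) of the coefficients, Var(g(beta_hat)) is approximated by grad(g)' V grad(g), where V = vcov(fit); when g is linear in beta the gradient does not depend on beta and the approximation is exact.

# hypothetical helper for illustration: SE of a linear combination c'beta
delta_se <- function(cvec, V) {
    sqrt(drop(t(cvec) %*% V %*% cvec))
}
# e.g. for the log hazard ratio b1 + b3 * t from fit1 below, the gradient
# with respect to (b0, b1, b2, b3) is c(0, 1, 0, t):
# delta_se(c(0, 1, 0, 2.5), vcov(fit1))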

# load packages and data --------------------------------------------------

pacman::p_load(rstpm2) # for nsx function
pacman::p_load(splines)
pacman::p_load(numDeriv) # Load numerical derivative package
devtools::load_all("~/git_repositories/casebase/")
#> Loading casebase
#> See example usage at http://sahirbhatnagar.com/casebase/
data("simdat") # from casebase package
str(simdat)
#> 'data.frame':    2000 obs. of  4 variables:
#>  $ id       : int  1 2 3 4 5 6 7 8 9 10 ...
#>  $ eventtime: num  3.38 2.89 2.76 5 2.67 ...
#>  $ status   : int  1 1 1 0 1 1 0 1 1 0 ...
#>  $ trt      : int  0 0 0 1 0 0 0 0 0 1 ...


# Fit casebase models -----------------------------------------------------
# without splines ---------------------------------------------------------
fit1 <- fitSmoothHazard(status ~ trt*eventtime,
                        data = simdat,
                        time = "eventtime")
summary(fit1)
#> 
#> Call:
#> glm(formula = formula, family = binomial, data = sampleData)
#> 
#> Deviance Residuals: 
#>     Min       1Q   Median       3Q      Max  
#> -0.2186  -0.1602  -0.1331  -0.1127   3.3677  
#> 
#> Coefficients:
#>               Estimate Std. Error z value Pr(>|z|)    
#> (Intercept)   -2.18460    0.07971 -27.407  < 2e-16 ***
#> trt           -0.71408    0.12849  -5.558 2.74e-08 ***
#> eventtime      0.25732    0.02748   9.364  < 2e-16 ***
#> trt:eventtime  0.10516    0.04168   2.523   0.0116 *  
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> (Dispersion parameter for binomial family taken to be 1)
#> 
#>     Null deviance: 13633  on 122714  degrees of freedom
#> Residual deviance: 13365  on 122711  degrees of freedom
#> AIC: 13373
#> 
#> Number of Fisher Scoring iterations: 8

# with splines ------------------------------------------------------------
fit2 <- fitSmoothHazard(status ~ trt*nsx(eventtime,df=3),
                        data = simdat,
                        time = "eventtime")
summary(fit2)
#> 
#> Call:
#> glm(formula = formula, family = binomial, data = sampleData)
#> 
#> Deviance Residuals: 
#>     Min       1Q   Median       3Q      Max  
#> -0.2018  -0.1650  -0.1388  -0.1096   3.5253  
#> 
#> Coefficients:
#>                             Estimate Std. Error z value Pr(>|z|)    
#> (Intercept)                  -2.5563     0.1729 -14.787  < 2e-16 ***
#> trt                          -0.9927     0.3112  -3.190  0.00142 ** 
#> nsx(eventtime, df = 3)1       0.9279     0.1679   5.525 3.29e-08 ***
#> nsx(eventtime, df = 3)2       2.1789     0.3983   5.470 4.50e-08 ***
#> nsx(eventtime, df = 3)3       1.0054     0.1424   7.059 1.68e-12 ***
#> trt:nsx(eventtime, df = 3)1   0.4371     0.2640   1.656  0.09771 .  
#> trt:nsx(eventtime, df = 3)2   1.1530     0.6948   1.660  0.09701 .  
#> trt:nsx(eventtime, df = 3)3   0.5669     0.2111   2.686  0.00724 ** 
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> (Dispersion parameter for binomial family taken to be 1)
#> 
#>     Null deviance: 13633  on 122714  degrees of freedom
#> Residual deviance: 13343  on 122707  degrees of freedom
#> AIC: 13359
#> 
#> Number of Fisher Scoring iterations: 8

# generate sequence of times for which to calculate hazard ratio ------------
newtime <- quantile(fit1[["data"]][[fit1[["timeVar"]]]], probs = seq(0.05, 0.95, 0.01))
newdata <- data.frame(trt = 1, eventtime = newtime, offset = 0)
newdata.ref <- data.frame(trt = 0, eventtime = newtime, offset = 0)

# Calculate Hazard Ratio as a function of time ----------------------------

# The hazard ratio for trt is
# exp(b0 + b1 + b2 * time + b3 * time) / exp(b0 + b2 * time) =
# exp(b1 + b3 * time)

# estimate the per-term contributions beta_j * x_j using the predict
# function with type = "terms"; this returns a matrix with one column per
# term, with the same number of rows as X
# see https://stackoverflow.com/questions/37963904/what-does-predict-glm-type-terms-actually-do
# note on centering: each term is centered at its training-data mean, so we
# need to add back the column means multiplied by their coefficients
bX1 <- predict(fit1, type = "terms",
               newdata = newdata,
               terms = c("trt","trt:eventtime"))

beta1 <- coef(fit1)
mm1 <- model.matrix(fit1)
avx1 <- colMeans(mm1)
int_names <- c("trt","trt:eventtime") # variables needed for HR
add_back1 <- sum(avx1[which(names(avx1) %in% int_names)] * beta1[which(names(avx1) %in% int_names)])

# this is the log hazard ratio as a function of time
log_hazard_ratio1 <- rowSums(bX1) + add_back1

# sanity check that we're getting the same hazard ratio as estimate_hazard
all.equal(exp(log_hazard_ratio1),
          exp(estimate_hazard(fit1, newdata = newdata)) /
              exp(estimate_hazard(fit1, newdata = newdata.ref)))
#> [1] TRUE


bX2 <- predict(fit2, type = "terms",
               newdata = newdata,
               terms = c("trt","trt:nsx(eventtime, df = 3)"))

beta2 <- coef(fit2)
mm2 <- model.matrix(fit2)
avx2 <- colMeans(mm2)
int_names <- c("trt",paste0("trt:",paste0("nsx(eventtime, df = 3)",1:3))) # variables needed for HR
add_back2 <- sum(avx2[which(names(avx2) %in% int_names)] * beta2[which(names(avx2) %in% int_names)])

# this is the log hazard ratio as a function of time
log_hazard_ratio2 <- rowSums(bX2) + add_back2

# sanity check that we're getting the same hazard ratio as estimate_hazard
all.equal(exp(log_hazard_ratio2),
          exp(estimate_hazard(fit2, newdata = newdata)) /
              exp(estimate_hazard(fit2, newdata = newdata.ref)))
#> [1] TRUE


# Calculate Standard Error of Time-Dependent HR Using Delta Method ---------------

v1 <- vcov(fit1)

# the log of the time-dependent hazard ratio, as a function of beta and time
f.HR1 <- function(beta, time){
    return(beta[2] + beta[4]*time)
}
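# note: f.HR1 is linear in beta, so its gradient is exactly c(0, 1, 0, time);
# the numerical jacobian below recovers this (up to floating point error)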

# Standard errors for each time using delta method
SE_time1 <- sapply(newtime, function(i) {
    grad_g <- numDeriv::jacobian(f.HR1, beta1, time = i)
    sqrt(grad_g %*% v1 %*% t(grad_g))
})

ci.lower.1 <- exp(log_hazard_ratio1 - 1.96 * SE_time1)
ci.upper.1 <- exp(log_hazard_ratio1 + 1.96 * SE_time1)


v2 <- vcov(fit2)

# the log of the time-dependent hazard ratio, as a function of beta and time
f.HR2 <- function(beta, time){
    return(beta[2] + (beta[6] + beta[7] + beta[8])*time)
}
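# note: this gradient assumes each spline coefficient multiplies raw time,
# which is not what the nsx() basis actually does -- see the fix further down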

# Standard errors for each time
SE_time2 <- sapply(newtime, function(i) {
    grad_g <- numDeriv::jacobian(f.HR2, beta2, time = i)
    sqrt(grad_g %*% v2 %*% t(grad_g))
})


ci.lower.2 <- exp(log_hazard_ratio2 - 1.96 * SE_time2)
ci.upper.2 <- exp(log_hazard_ratio2 + 1.96 * SE_time2)

# Plot Hazard Ratio and Confidence Interval -------------------------------


plot(newtime,
     exp(log_hazard_ratio1),
     type = "l", lty = 1, lwd = 2,
     ylim = range(ci.lower.1,ci.upper.1),
     ylab = "Hazard Ratio for trt",
     main = "Linear interaction: status ~ trt*eventtime")
lines(newtime, ci.lower.1, lty = 2, lwd = 2, col = "grey")
lines(newtime, ci.upper.1, lty = 2, lwd = 2, col = "grey")

plot(newtime,
     exp(log_hazard_ratio2),
     type = "l", lty = 1, lwd = 2,
     ylim = range(ci.lower.2,ci.upper.2),
     ylab = "Hazard Ratio for trt",
     main = "NonLinear interaction: status ~ trt*nsx(eventtime,df=3)")
lines(newtime, ci.lower.2, lty = 2, lwd = 2, col = "grey")
lines(newtime, ci.upper.2, lty = 2, lwd = 2, col = "grey")

Created on 2020-05-28 by the reprex package (v0.3.0)

The issue with the splines was that the new times were not being projected onto the original basis functions. This can be accomplished using a combination of model.frame and model.matrix, which is in fact what predict.lm does: it first calls model.frame on a terms object whose predvars attribute contains the knots and boundary knots of the basis functions, and then calls model.matrix to create the design matrix. Note that this works for any type of transformation, including the linear interaction example (i.e. no splines).
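As a quick sketch of the mechanics (using fit2 from above, and assuming the nsx basis registers a makepredictcall method, which is what populates predvars):

tt <- delete.response(terms(fit2))
attr(tt, "predvars") # the nsx() calls here should carry the fitted knots and Boundary.knots
mf <- model.frame(tt,
                  data = data.frame(trt = 1, eventtime = 2.5, offset = 0),
                  na.action = na.pass, xlev = fit2$xlevels)
X <- model.matrix(tt, mf, contrasts.arg = fit2$contrasts) # same basis as at fit time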

The next step is to implement this in a function, because currently we manually zero out the beta coefficients that don't enter the hazard ratio, so that beta stays the same length as the columns of the model matrix.

# load packages and data --------------------------------------------------

pacman::p_load(rstpm2) # for nsx function
pacman::p_load(splines)
pacman::p_load(numDeriv) # Load numerical derivative package
devtools::load_all("~/git_repositories/casebase/")
#> Loading casebase
#> See example usage at http://sahirbhatnagar.com/casebase/
data("simdat") # from casebase package
str(simdat)
#> 'data.frame':    2000 obs. of  4 variables:
#>  $ id       : int  1 2 3 4 5 6 7 8 9 10 ...
#>  $ eventtime: num  3.38 2.89 2.76 5 2.67 ...
#>  $ status   : int  1 1 1 0 1 1 0 1 1 0 ...
#>  $ trt      : int  0 0 0 1 0 0 0 0 0 1 ...


# Fit casebase models -----------------------------------------------------
# with splines ------------------------------------------------------------
fit2 <- fitSmoothHazard(status ~ trt*nsx(eventtime,df=3),
                        data = simdat,
                        time = "eventtime")
summary(fit2)
#> 
#> Call:
#> glm(formula = formula, family = binomial, data = sampleData)
#> 
#> Deviance Residuals: 
#>     Min       1Q   Median       3Q      Max  
#> -0.2028  -0.1649  -0.1387  -0.1098   3.5298  
#> 
#> Coefficients:
#>                             Estimate Std. Error z value Pr(>|z|)    
#> (Intercept)                  -2.5624     0.1740 -14.724  < 2e-16 ***
#> trt                          -1.0069     0.3133  -3.214  0.00131 ** 
#> nsx(eventtime, df = 3)1       0.9014     0.1680   5.366 8.06e-08 ***
#> nsx(eventtime, df = 3)2       2.2029     0.4013   5.490 4.03e-08 ***
#> nsx(eventtime, df = 3)3       1.0072     0.1427   7.056 1.72e-12 ***
#> trt:nsx(eventtime, df = 3)1   0.4530     0.2642   1.714  0.08648 .  
#> trt:nsx(eventtime, df = 3)2   1.1865     0.6998   1.695  0.08998 .  
#> trt:nsx(eventtime, df = 3)3   0.5842     0.2117   2.759  0.00580 ** 
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> (Dispersion parameter for binomial family taken to be 1)
#> 
#>     Null deviance: 13633  on 122714  degrees of freedom
#> Residual deviance: 13343  on 122707  degrees of freedom
#> AIC: 13359
#> 
#> Number of Fisher Scoring iterations: 8


# generate sequence of times for which to calculate hazard ratio ------------
newtime <- quantile(fit2[["data"]][[fit2[["timeVar"]]]], probs = seq(0.05, 0.95, 0.01))
newdata <- data.frame(trt = 1, eventtime = newtime, offset = 0)
newdata.ref <- data.frame(trt = 0, eventtime = newtime, offset = 0)


bX2 <- predict(fit2, type = "terms",
               newdata = newdata,
               terms = c("trt","trt:nsx(eventtime, df = 3)"))

beta2 <- coef(fit2)
mm2 <- model.matrix(fit2)
avx2 <- colMeans(mm2)
int_names <- c("trt",paste0("trt:",paste0("nsx(eventtime, df = 3)",1:3))) # variables needed for HR
add_back2 <- sum(avx2[which(names(avx2) %in% int_names)] * beta2[which(names(avx2) %in% int_names)])

# this is the log hazard ratio as a function of time
log_hazard_ratio2 <- rowSums(bX2) + add_back2

# sanity check that we're getting the same hazard ratio as estimate_hazard
all.equal(exp(log_hazard_ratio2),
          exp(estimate_hazard(fit2, newdata = newdata)) /
              exp(estimate_hazard(fit2, newdata = newdata.ref)))
#> [1] TRUE

object <- fit2
tt <- terms(object)
Terms <- delete.response(tt)
v2 <- vcov(fit2)

# the log of the time-dependent hazard ratio, as a function of beta and time
f.HR2 <- function(beta, time){
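    # zero out everything except trt (position 2) and the trt:spline
    # interactions (positions 6 to 8); the remaining terms cancel
    # between exposed and reference in the hazard ratio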
    beta[-c(2,6:8)] <- 0
    m <- model.frame(Terms,
                     data = data.frame(trt = 1, eventtime = time, offset = 0),
                     na.action = na.pass,
                     xlev = object$xlevels)
    
    X <- model.matrix(Terms, m, contrasts.arg = object$contrasts)
    sum(X * beta)
}

# Standard errors for each time
SE_time2 <- sapply(newtime, function(i) {
    grad_g <- numDeriv::jacobian(f.HR2, beta2, time = i)
    sqrt(grad_g %*% v2 %*% t(grad_g))
})

ci.lower.2 <- exp(log_hazard_ratio2 - 1.96 * SE_time2)
ci.upper.2 <- exp(log_hazard_ratio2 + 1.96 * SE_time2)

# Plot Hazard Ratio and Confidence Interval -------------------------------

plot(newtime,
     exp(log_hazard_ratio2),
     type = "l", lty = 1, lwd = 2,
     ylim = range(ci.lower.2,ci.upper.2),
     ylab = "Hazard Ratio for trt",
     main = "NonLinear interaction: status ~ trt*nsx(eventtime,df=3)")
lines(newtime, ci.lower.2, lty = 2, lwd = 2, col = "grey")
lines(newtime, ci.upper.2, lty = 2, lwd = 2, col = "grey")

Created on 2020-05-31 by the reprex package (v0.3.0)

Really nice! I was thinking about the same problem of setting things equal to zero programmatically, and I think you can do it the following way:

f.HR2 <- function(beta, time){
    m1 <- model.frame(Terms,
                      data = data.frame(trt = 1, eventtime = time, offset = 0),
                      na.action = na.pass,
                      xlev = object$xlevels)
    m0 <- model.frame(Terms,
                      data = data.frame(trt = 0, eventtime = time, offset = 0),
                      na.action = na.pass,
                      xlev = object$xlevels)
    
    X1 <- model.matrix(Terms, m1, contrasts.arg = object$contrasts)
    X0 <- model.matrix(Terms, m0, contrasts.arg = object$contrasts)
    
    sum((X1 - X0) * beta)
}

Yes. Your approach works and is much faster!

# load packages and data --------------------------------------------------

pacman::p_load(rstpm2) # for nsx function
pacman::p_load(splines)
pacman::p_load(numDeriv) # Load numerical derivative package
pacman::p_load(bench)
devtools::load_all("~/git_repositories/casebase/")
#> Loading casebase
#> See example usage at http://sahirbhatnagar.com/casebase/
data("simdat") # from casebase package
str(simdat)
#> 'data.frame':    2000 obs. of  4 variables:
#>  $ id       : int  1 2 3 4 5 6 7 8 9 10 ...
#>  $ eventtime: num  3.38 2.89 2.76 5 2.67 ...
#>  $ status   : int  1 1 1 0 1 1 0 1 1 0 ...
#>  $ trt      : int  0 0 0 1 0 0 0 0 0 1 ...

# Fit casebase models -----------------------------------------------------
# with splines ------------------------------------------------------------
fit2 <- fitSmoothHazard(status ~ trt*nsx(eventtime,df=3),
                        data = simdat,
                        time = "eventtime")
summary(fit2)
#> 
#> Call:
#> glm(formula = formula, family = binomial, data = sampleData)
#> 
#> Deviance Residuals: 
#>     Min       1Q   Median       3Q      Max  
#> -0.2004  -0.1651  -0.1388  -0.1096   3.5271  
#> 
#> Coefficients:
#>                             Estimate Std. Error z value Pr(>|z|)    
#> (Intercept)                  -2.5619     0.1740 -14.723  < 2e-16 ***
#> trt                          -0.9956     0.3128  -3.183  0.00146 ** 
#> nsx(eventtime, df = 3)1       0.9365     0.1680   5.574 2.49e-08 ***
#> nsx(eventtime, df = 3)2       2.1835     0.4013   5.441 5.29e-08 ***
#> nsx(eventtime, df = 3)3       0.9913     0.1428   6.942 3.88e-12 ***
#> trt:nsx(eventtime, df = 3)1   0.4310     0.2644   1.630  0.10304    
#> trt:nsx(eventtime, df = 3)2   1.1709     0.6989   1.675  0.09388 .  
#> trt:nsx(eventtime, df = 3)3   0.5752     0.2116   2.718  0.00657 ** 
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> (Dispersion parameter for binomial family taken to be 1)
#> 
#>     Null deviance: 13633  on 122714  degrees of freedom
#> Residual deviance: 13345  on 122707  degrees of freedom
#> AIC: 13361
#> 
#> Number of Fisher Scoring iterations: 8

# generate sequence of times for which to calculate hazard ratio ------------
newtime <- quantile(fit2[["data"]][[fit2[["timeVar"]]]], probs = seq(0.05, 0.95, 0.01))
newdata <- data.frame(trt = 1, eventtime = newtime, offset = 0)
newdata.ref <- data.frame(trt = 0, eventtime = newtime, offset = 0)


bX2 <- predict(fit2, type = "terms",
               newdata = newdata,
               terms = c("trt","trt:nsx(eventtime, df = 3)"))

beta2 <- coef(fit2)
mm2 <- model.matrix(fit2)
avx2 <- colMeans(mm2)
int_names <- c("trt",paste0("trt:",paste0("nsx(eventtime, df = 3)",1:3))) # variables needed for HR
add_back2 <- sum(avx2[which(names(avx2) %in% int_names)] * beta2[which(names(avx2) %in% int_names)])

# this is the log hazard ratio as a function of time
log_hazard_ratio2 <- rowSums(bX2) + add_back2

# sanity check that we're getting the same hazard ratio as estimate_hazard
all.equal(exp(log_hazard_ratio2),
          exp(estimate_hazard(fit2, newdata = newdata)) /
              exp(estimate_hazard(fit2, newdata = newdata.ref)))
#> [1] TRUE


object <- fit2
tt <- terms(object)
Terms <- delete.response(tt)
v2 <- vcov(fit2)


# MANUAL WAY: the time-dependent log hazard ratio function ------------------
f.manual <- function(beta, time){
    beta[-c(2,6:8)] <- 0
    m <- model.frame(Terms,
                     data = data.frame(trt = 1, eventtime = time, offset = 0),
                     na.action = na.pass,
                     xlev = object$xlevels)
    
    X <- model.matrix(Terms, m, contrasts.arg = object$contrasts)
    sum(X * beta)
}

# Standard errors for each time
SE_manual <- sapply(newtime, function(i) {
  grad_g <- numDeriv::jacobian(f.manual, beta2, time = i)
  sqrt(grad_g %*% v2 %*% t(grad_g))
})


# AUTOMATIC WAY to calculate the time-dependent log hazard ratio ------------------

f.automatic <- function(beta, time){
  m1 <- model.frame(Terms,
                    data = data.frame(trt = 1, eventtime = time, offset = 0),
                    na.action = na.pass,
                    xlev = object$xlevels)
  m0 <- model.frame(Terms,
                    data = data.frame(trt = 0, eventtime = time, offset = 0),
                    na.action = na.pass,
                    xlev = object$xlevels)
  
  X1 <- model.matrix(Terms, m1, contrasts.arg = object$contrasts)
  X0 <- model.matrix(Terms, m0, contrasts.arg = object$contrasts)
  
  sum((X1 - X0) * beta)
  
}
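# since the log hazard ratio is linear in beta, (X1 - X0) is in fact the
# exact gradient -- the FAST version below exploits this directly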

# Standard errors for each time
SE_automatic <- sapply(newtime, function(i) {
  grad_g <- numDeriv::jacobian(f.automatic, beta2, time = i)
  sqrt(grad_g %*% v2 %*% t(grad_g))
})



# AUTOMATIC FAST WAY to calculate the time-dependent log hazard ratio ------------------

f.automaticFAST <- function(beta, time){
  m1 <- model.frame(Terms,
                    data = data.frame(trt = 1, eventtime = time, offset = 0),
                    na.action = na.pass,
                    xlev = object$xlevels)
  m0 <- model.frame(Terms,
                    data = data.frame(trt = 0, eventtime = time, offset = 0),
                    na.action = na.pass,
                    xlev = object$xlevels)
  
  X1 <- model.matrix(Terms, m1, contrasts.arg = object$contrasts)
  X0 <- model.matrix(Terms, m0, contrasts.arg = object$contrasts)
  
  # this is the jacobian!!
  (X1 - X0)
  
}

jacobi <- f.automaticFAST(beta2, time = newtime)
SE_automatic_FAST <- sqrt(diag(jacobi %*% tcrossprod(v2, jacobi)))



# Compare Timings -----------------------------------------------------------------------

res <- bench::mark(
  manual = sapply(newtime, function(i) {
    grad_g <- numDeriv::jacobian(f.manual, beta2, time = i)
    sqrt(grad_g %*% v2 %*% t(grad_g))}),
  
  automatic = sapply(newtime, function(i) {
    grad_g <- numDeriv::jacobian(f.automatic, beta2, time = i)
    sqrt(grad_g %*% v2 %*% t(grad_g))}),
  
  autoFAST = {
    jacobi <- f.automaticFAST(beta2, time = newtime)
    SE_automatic_FAST <- sqrt(diag(jacobi %*% tcrossprod(v2, jacobi)))
  },
  
  check = TRUE, # checks that all results are equal
  iterations = 10
)
#> Warning: Some expressions had a GC in every iteration; so filtering is disabled.


res
#> # A tibble: 3 x 6
#>   expression      min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr> <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 manual        4.17s    4.31s     0.228    5.69MB     3.92
#> 2 automatic     8.61s    8.74s     0.114   11.28MB     3.60
#> 3 autoFAST     1.51ms   1.52ms   651.     264.24KB     0
ggplot2::autoplot(res)
#> Loading required namespace: tidyr

Created on 2020-06-10 by the reprex package (v0.3.0)

Here's a first attempt at wrapping all of the above in a function. Following the rstpm2 package, the newdata argument is the reference category, and by default the exposed argument will increment the var argument by 1 using the incrVar internal function. Otherwise, the user can supply their own function to the exposed argument, as shown below:
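For reference, a minimal sketch of what an incrVar-style helper could look like (my own version for illustration; the actual internal may differ):

incrVar <- function(var, increment = 1) {
    # returns a function that shifts `var` by `increment` in a data frame
    function(data) {
        data[[var]] <- data[[var]] + increment
        data
    }
}
# e.g. exposed = incrVar("trt") maps the reference data (trt = 0) to trt = 1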

pacman::p_load(rstpm2) # for nsx function
pacman::p_load(splines)
devtools::load_all("~/git_repositories/casebase/")
#> Loading casebase
#> See example usage at http://sahirbhatnagar.com/casebase/
data("simdat") # from casebase package


fit2 <- fitSmoothHazard(status ~ trt*nsx(eventtime,df=3),
                        data = simdat,
                        time = "eventtime")

newtime <- quantile(fit2[["data"]][[fit2[["timeVar"]]]], probs = seq(0.05, 0.95, 0.01))

par(mfrow = c(1,2))

# by default this will increment `var` by 1 for the exposed category
plot(fit2, 
     type = "hr", 
     newdata = data.frame(trt = 0, eventtime = newtime),
     var = "trt", 
     xvar = "eventtime", 
     ci = TRUE)

# user defined exposed data
plot(fit2, 
     type = "hr", 
     newdata = data.frame(trt = 0, eventtime = newtime),
     exposed = function(data) transform(data, trt = 1),
     xvar = "eventtime", 
     ci = TRUE)

Created on 2020-06-13 by the reprex package (v0.3.0)

Really nice, they look great!