therneau/survival

std.err missing from concordance() output object

Closed this issue · 2 comments

survConcordance() has been telling me it's deprecated for some time now, and I'm told to switch to concordance().

The problem I have with this is that the outputs are not quite the same:

List of 5
 $ concordance: Named num 0.499
  ..- attr(*, "names")= chr "concordant"
 $ stats      : Named num [1:5] 921 924 0 1 182
  ..- attr(*, "names")= chr [1:5] "concordant" "discordant" "tied.risk" "tied.time" ...
 $ n          : int 71
 $ std.err    : Named num 0.0493
  ..- attr(*, "names")= chr "std(c-d)"
 $ call       : language survival::survConcordance(formula = formula, data = data)
 - attr(*, "class")= chr "survConcordance"

versus

List of 6
 $ concordance: num 0.501
 $ count      : Named num [1:5] 924 921 0 1 0
  ..- attr(*, "names")= chr [1:5] "concordant" "discordant" "tied.x" "tied.y" ...
 $ n          : int 71
 $ var        : num 0.00223
 $ cvar       : num 0.00243
 $ call       : language concordance.formula(object = formula, data = data, std.err = T)
 - attr(*, "class")= chr "concordance"

Comparing these, I find enough similarities:
"concordance" is retained although concordant and discordant are switched between the two.
"n" is retained.
"std.err" is missing in concordance even though it is calculated and displayed in the output (slightly different from the value obtained from survConcordance()):

Call:
concordance.formula(object = formula, data = data, std.err = T)

n= 71 
Concordance= 0.5008 se= 0.04717
concordant discordant     tied.x     tied.y    tied.xy 
       924        921          0          1          0 

Could you add std.err back into the output structure of concordance() or explain how it is to be extracted?

  1. The concordance is formally Pr(y[i] > y[j] | yhat[i] > yhat[j]), where y is a response and yhat the prediction. The concordance function takes this general view, and works for lm, glm, coxph, survreg, ... models. You can even use it with machine learning models: get the predicted values yhat from the model, and plug them into concordance.

  2. The Cox model is odd, in that a higher value of the linear predictor corresponds to a shorter survival; hence you need the reverse argument. When I wrote survConcordance I was thinking only of Cox models, so I made reverse the default. The new function will do the right thing if it has a hint, i.e., you call it as concordance(coxfit, newdata=..) where the first argument is a coxph fit, but if you use the survival time and predictor directly, you have to tell it. For a survreg model, BTW, a higher linear predictor implies a longer survival; it is not reversed.

  3. If you do cfit <- concordance(....) the easiest and most reliable way to get the estimate and its variance is to use coef(cfit) and vcov(cfit), the generic functions that work with almost all modeling functions. They are guaranteed to behave even if I end up relabeling some internal bits.

  4. If you have a data set 'zed' with y, p1, p2, p3; where p1, p2, and p3 are predictions from 3 different models, you can use cfit <- concordance(y ~ p1 + p2 + p3, data=zed). In that case coef(cfit) will be of length 3 and vcov will be 3 by 3. This is used to test whether one model has 'significantly' better concordance than another.
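
As an illustration of points 2 and 3, here is a minimal sketch of pulling the concordance and its standard error out of the fitted object. The lung data set, the model formula, and the variable names (d, fit, cfit, c_est, c_se, lp) are assumptions chosen for the example and do not come from the original report. The standard error is taken as the square root of vcov(cfit), which agrees with the printed output above (sqrt(0.00223) is about 0.047).

```r
library(survival)

# Illustrative data: complete cases from the lung data set bundled with survival.
d <- na.omit(lung[, c("time", "status", "age", "ph.ecog")])
fit <- coxph(Surv(time, status) ~ age + ph.ecog, data = d)

# Passing the coxph fit lets concordance() work out the direction itself.
cfit <- concordance(fit)
c_est <- unname(coef(cfit))               # the concordance
c_se  <- sqrt(unname(drop(vcov(cfit))))   # standard error = sqrt of the variance

# Using the response and predictor directly: a higher Cox linear predictor
# means shorter survival, so reverse = TRUE has to be stated explicitly.
d$lp <- predict(fit, type = "lp")
cdir <- concordance(Surv(time, status) ~ lp, data = d, reverse = TRUE)
cfwd <- concordance(Surv(time, status) ~ lp, data = d)  # direction left as is

print(c(concordance = c_est, se = c_se))
```

With reverse = TRUE the formula call should reproduce the value obtained from the fit, while the call without it gives one minus that value, which is the same concordant/discordant swap seen in the two str() listings above.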

There is a whole vignette on concordance. You might want to read part of it.

Terry
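
For readers who want to try point 4, the following is a rough sketch of comparing several sets of predictions in one call. The data frame zed here is a hypothetical stand-in built from the built-in mtcars data, and the three lm fits, the contrast vector, and the names est, V, diff12, se12 and z12 are assumptions made for the example. It relies on the behaviour described in the reply, namely that coef() has one entry per predictor and vcov() is the matching square matrix.

```r
library(survival)

# Hypothetical stand-in for 'zed': a numeric response plus predictions
# from three linear models fitted to the built-in mtcars data.
zed <- data.frame(
  y  = mtcars$mpg,
  p1 = fitted(lm(mpg ~ wt, data = mtcars)),
  p2 = fitted(lm(mpg ~ wt + hp, data = mtcars)),
  p3 = fitted(lm(mpg ~ wt + hp + qsec, data = mtcars))
)

# One concordance per prediction column, with a joint variance matrix.
cfit <- concordance(y ~ p1 + p2 + p3, data = zed)
est <- coef(cfit)   # length 3
V   <- vcov(cfit)   # 3 by 3

# Contrast for "p2 versus p1": difference in concordance and its standard error.
contr  <- c(-1, 1, 0)
diff12 <- sum(contr * est)
se12   <- sqrt(drop(t(contr) %*% V %*% contr))
z12    <- diff12 / se12

print(c(difference = diff12, se = se12, z = z12))
```

The ratio z12 can then be compared with a standard normal distribution as an approximate test of whether the second model's concordance differs from the first.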