therneau/survival

"Special trickery for matched case-control data" and ymax in concordancefit()

Within concordancefit(), the "Special trickery for matched case-control data" changes y to create disjoint time intervals, one per stratum. After that change, the ymax argument (and presumably also ymin) no longer has the intended effect.
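
A toy illustration of why a cutoff stops working once times are moved onto disjoint per-stratum intervals. This is only a sketch of the idea, not the code in concordancefit(): the shifting rule, the variable names and the numbers are all made up for illustration, and the actual transformation in the package may differ.

```r
## Hypothetical toy data: two strata, times on the original scale
time    <- c(5, 10, 3, 8)
stratum <- c(1, 1, 2, 2)

## Assumed shifting rule (illustration only): move each stratum onto its
## own disjoint stretch of the time axis
width   <- max(time) + 1
shifted <- time + (stratum - 1) * width

## A cutoff above every original time
ymax <- 15

## On the original scale the cutoff touches no observation
on_original <- time <= ymax

## On the shifted scale the same cutoff now cuts into the second stratum
on_shifted <- shifted <= ymax

print(rbind(time, shifted, on_original, on_shifted))
```

If ymax is compared against the shifted values, a cutoff that should be a no-op ends up truncating observations in the later strata, which would be consistent with the behaviour reported here.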

For example, in every call below ymax > max(dfr$ptime), so I would expect the same result as without ymax. However, the correct result is returned only when keepstrata = TRUE (i.e. when the "special trickery" is not used).

library(survival)

dfr <- mgus2[!is.na(mgus2$mspike),]

## create 12 strata
dfr$stratum <- cut(dfr$age, breaks = c(0, seq(45, 100, 5)))

## fit a stratified Cox model
fit1 <- coxph(Surv(ptime, pstat) ~ mspike + strata(stratum),
              dfr)

## without ymax
concordance(Surv(ptime, pstat) ~ predict(fit1) + strata(stratum),
            data = dfr)

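## with ymax = 500, strata kept: same result as without ymax
concordance(Surv(ptime, pstat) ~ predict(fit1) + strata(stratum),
            data = dfr,
            keepstrata = TRUE,
            ymax = 500)

## with ymax = 500, strata not kept ("special trickery" used): result differs
concordance(Surv(ptime, pstat) ~ predict(fit1) + strata(stratum),
            data = dfr,
            keepstrata = FALSE,
            ymax = 500)

## with ymax = 2000, strata kept: same result as without ymax
concordance(Surv(ptime, pstat) ~ predict(fit1) + strata(stratum),
            data = dfr,
            keepstrata = TRUE,
            ymax = 2000)

## with ymax = 2000, strata not kept ("special trickery" used): result differs
concordance(Surv(ptime, pstat) ~ predict(fit1) + strata(stratum),
            data = dfr,
            keepstrata = FALSE,
            ymax = 2000)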
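
To compare the four results numerically rather than by eye, the calls can be collected into one vector and differenced against the fit without ymax. This is a sketch that repeats the setup above so it runs on its own; it assumes that the `concordance` component of the returned object holds the overall estimate, and the names `frm`, `base` and `res` are introduced here only for illustration.

```r
library(survival)

## Same setup as in the report above
dfr <- mgus2[!is.na(mgus2$mspike), ]
dfr$stratum <- cut(dfr$age, breaks = c(0, seq(45, 100, 5)))
fit1 <- coxph(Surv(ptime, pstat) ~ mspike + strata(stratum), dfr)

frm <- Surv(ptime, pstat) ~ predict(fit1) + strata(stratum)

## Reference value: no ymax
base <- concordance(frm, data = dfr)$concordance

## The four calls from the report, keeping only the concordance estimate
res <- c(
  keep_500  = concordance(frm, data = dfr, keepstrata = TRUE,  ymax = 500)$concordance,
  drop_500  = concordance(frm, data = dfr, keepstrata = FALSE, ymax = 500)$concordance,
  keep_2000 = concordance(frm, data = dfr, keepstrata = TRUE,  ymax = 2000)$concordance,
  drop_2000 = concordance(frm, data = dfr, keepstrata = FALSE, ymax = 2000)$concordance
)

## Differences from the reference; all four should be zero,
## since ymax exceeds max(dfr$ptime) in every call
print(res - unname(base))
```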

Good catch. I hadn't thought about ymax/ymin when doing the 'trickery' code.
In my defense, I can't think of a case where I would want to use ymin or ymax in case-control data. Could you give some context, just for my education? (It's still a bug.)

I found this while working with stratified survival data (more than 10 strata), not case-control data.