stan-dev/rstan

Various hard-to-track / hard-to-repeat crashes on different machines with StanHeaders 2.26.26/27

cdriveraus opened this issue · 10 comments

I am seeing problems across different PCs, operating systems, and software packages, using R 4.2 / 4.3. They range from complete freezes, to odd compilation errors, to RStudio sessions crashing. The affected packages are bigIRT and ctsem, which both use rstan to compute the log probability / gradient of fairly large Stan models, and sometimes use the sampling procedure. Sorry for the terribly vague report. I am still tracking the occurrences and trying to make sense of users' reports, but I wanted to post something about it. Is there some awareness of these issues, and are others facing similar problems?
As best I can tell, the problems started with 2.26.26, but some may already have been present with .25; I can't be sure.

Sometimes the issues only appear after analyzing data for a day; in other cases the crash is repeatable (for the specific machine / OS / R version configuration) and happens quickly.

Is this both rstan and StanHeaders 2.26, or rstan 2.21 and StanHeaders 2.26?

Can you share a reproducible example?
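To answer the version question above, something like the following could be pasted alongside a report. It is only a sketch: it reads the installed versions without loading either package, so it should be safe to run even on a machine where loading rstan crashes the session.

```r
# Report the package versions relevant to this issue without loading them.
# packageVersion() reads the package metadata, so no compiled code is run.
pkgs <- c("rstan", "StanHeaders")
versions <- vapply(pkgs, function(p) {
  if (nzchar(system.file(package = p))) {
    as.character(packageVersion(p))
  } else {
    NA_character_  # package not installed
  }
}, character(1))

print(versions)
print(R.version.string)
```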

I'm not sure whether it also happens with rstan 2.26; I will have to check.
On Windows 11 with R 4.3.1 / Rtools 4.3, StanHeaders 2.26.27 and rstan 2.21.8, the following reliably crashed the R session:

remotes::install_github('cdriveraus/ctsem', INSTALL_opts = "--no-multiarch")
library(ctsem)
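set.seed(1)
s=list()
# Simulation settings: 500 subjects, 15 time points each
nsubjects=500
Tpoints=15
# Subject-specific continuous intercepts drawn from N(parmu, parsd)
parsd=1.4
parmu= -3.4
dt=1
par= (rnorm(nsubjects,parmu,parsd))
mean(par)
sd(par)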

# Simulate one series per subject and stack the results into long format
for(subi in 1:nsubjects){
  gm=suppressMessages(ctModel(LAMBDA=diag(1), Tpoints=Tpoints, DRIFT=matrix(-.5),T0MEANS = matrix(4), 
    CINT=matrix(par[subi]),DIFFUSION=matrix(1),
    T0VAR=matrix(1), MANIFESTVAR=matrix(.3)))
  d=suppressMessages(ctGenerate(gm,n.subjects = 1,burnin = 0,dtmean = dt))
  if(subi==1) dat=cbind(subi,d) else dat=rbind(dat,cbind(subi,d))
}

colnames(dat)[1]='id'

dm <- ctModel(LAMBDA=diag(1), type='standt',
  CINT=matrix('cint'),
  MANIFESTMEANS = matrix(0)
)

  
# Fit by optimization (optimize=TRUE); this is the call that crashed the session
f = ctStanFit(datalong = dat,ctstanmodel = dm,optimize=TRUE,
  verbose=0,savescores = FALSE,priors=FALSE)
  

If you reinstall rstan 2.21 from source does it still crash?

jgabry commented

I think @ssp3nc3r and others have also been dealing with crashes on Mac using the latest CRAN versions of R, RStan and StanHeaders

If they can share a reproducible example, then I can debug locally on Mac.

Unfortunately, I can't reproduce the ctsem crash above on Mac.

jgabry commented

I think for @ssp3nc3r every Stan program was crashing. Is that right, Scott?

jgabry commented

Here’s another report of crashing on a Mac with the latest CRAN versions of everything:

https://discourse.mc-stan.org/t/example-in-rstan-sampling-crashes-r-completely-on-mac-osx/27873/3

Unfortunately no Stan programs were provided, but it sounds like it was happening for every program, as in @ssp3nc3r’s case.

I'm also seeing now that the only setup passing tests when building and testing ctsem on GitHub Actions is the one that upgrades to rstan 2.26: https://github.com/cdriveraus/ctsem/actions/runs/5455364042/jobs/9926706393

However, I am still getting complete PC freezes when running bigIRT for longer periods with rstan 2.26...

The tricky thing with 2.26 is that rstan 2.21 needs to be rebuilt against StanHeaders 2.26; otherwise it will segfault. Additionally, 2.26 has a lot of custom IFDEFs and branching for compatibility, so things are likely to be much more stable once 2.32 is up.
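As a rough sketch of the rebuild step: the helper below is hypothetical (it is not part of rstan) and only flags the case where rstan and StanHeaders come from different major.minor series, as with the 2.21.8 / 2.26.27 combination reported earlier in this thread. It cannot tell whether an installed binary was already built against the current headers, so treat it as a hint rather than a diagnosis.

```r
# Hypothetical helper, not part of rstan: TRUE when the two version strings
# come from different major.minor series (e.g. "2.21.8" vs "2.26.27").
series_differ <- function(rstan_ver, headers_ver) {
  series <- function(v) unclass(package_version(v))[[1]][1:2]
  !identical(series(rstan_ver), series(headers_ver))
}

# Versions reported earlier in this thread.
mismatch <- series_differ("2.21.8", "2.26.27")
print(mismatch)

# Only rebuild from an interactive session, ideally a fresh one with no Stan
# packages loaded. Needs a working toolchain (Rtools on Windows).
if (interactive() && mismatch) {
  install.packages("rstan", type = "source")
}
```

The `interactive()` guard is a design choice for this sketch, so that sourcing the script non-interactively never triggers an install.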

@bgoodri what do you think? I think it would be best to get 2.32 up as soon as possible and debug/troubleshoot that, rather than debugging/troubleshooting 2.21/2.26 for issues that might not even be present in the newer version.