predict(model, newdata) fails due to incorrect call capturing by lme(), nlme(), and gnls()

Question

Closed this issue 6 months ago · 4 comments

I'm not sure whether this is of interest, or whether there's any hope of nlme fixes made here ever making it into the CRAN version. But since I've ended up using this version of nlme rather than the CRAN version (thanks for the weighting fix!) and am not aware of nlme development happening elsewhere, I thought it might be worth capturing this nlme issue here. Problems after calling lme() from within a function were noted on StackOverflow in 2012, and it appears nlme call capturing may have been unreliable for at least 20 years.

lme(), nlme(), gnls(), and possibly other parts of nlme capture formulas passed to them in a $call property. If these nlme functions are called from within a user-defined function, for example because models are being cross validated, their $call stores the names of that function's arguments rather than the values behind them. When $call is accessed later by predict.lme(model, newdata), predict.nlme(model, newdata), predict.gnls(model, newdata), and possibly in other call graphs internal to nlme, predict() fails because the scope is no longer the original function call and $call's fields point to objects which are no longer accessible.
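A minimal sketch of the capture behaviour described above. It uses base R's lm() as a stand-in, since lm() also stores match.call() and does not depend on which nlme version is installed; it is not the nlme code path itself. The names fit_fold, fold_formula, and fold_data are hypothetical.

```r
# Hypothetical wrapper of the kind used for cross validation.
fit_fold <- function(fold_formula, fold_data) {
  lm(fold_formula, data = fold_data)
}

train <- data.frame(x = 1:6, y = c(1.2, 1.9, 3.2, 3.8, 5.1, 6.0))
fit <- fit_fold(y ~ x, train)

# The stored call records the wrapper's argument names, not their values.
captured <- fit$call
print(captured)

# Outside the wrapper those names no longer resolve, so evaluating a
# field of the call fails; the error message is kept as a string here.
lookup <- tryCatch(
  eval(captured$data, envir = globalenv()),
  error = function(e) conditionMessage(e)
)
print(lookup)
```

Any later code that re-evaluates a field of the stored call from a different scope runs into the same lookup failure.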

A workaround is to use do.call(nlme, list([named list of arguments])) to force copying of the arguments, so that nlme stores durably reusable values in $call's fields. If this isn't done, errors such as "does not exist" or model.frame.default()'s "invalid type (language) for variable" result. Since this workaround isn't readily discoverable and the error messages are cryptic, nlme improvements in this area seem likely to be helpful if someone has time to put a PR together.
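A sketch of the do.call() workaround, assuming nlme is installed and using its bundled Orthodont data. The helper fit_fold and its argument names are hypothetical, and lme() is used here only as one of the affected functions.

```r
library(nlme)

# Hypothetical cross-validation helper; argument names are illustrative.
fit_fold <- function(fold_fixed, fold_random, fold_data) {
  # do.call() evaluates the list elements first, so the call that lme()
  # captures contains the formula and data objects themselves rather than
  # the local names fold_fixed, fold_random, and fold_data.
  do.call(lme, list(fixed = fold_fixed, data = fold_data, random = fold_random))
}

fit <- fit_fold(distance ~ age, ~ 1 | Subject, Orthodont)

# Population-level predictions (level = 0) for a few rows of new data.
new_rows <- head(Orthodont, 4)
pred <- predict(fit, newdata = new_rows, level = 0)
print(pred)
```

One cost of this approach is that the stored call embeds the whole data frame, so printing or deparsing the call can be verbose.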

Answer 1 · 2023-03-22T21:40:28.000Z

I haven't looked into what's already done, but a standard approach that can help with this is to evaluate $call in the environment of the formula - this can still break, but does handle a lot of cases.
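A minimal sketch of that approach, again using base R's lm() as a stand-in rather than nlme's internals. The wrapper name fit_in_wrapper and the argument name fold_data are hypothetical.

```r
# Hypothetical wrapper; the formula is written inside it, so the formula's
# environment is the wrapper's own frame.
fit_in_wrapper <- function(fold_data) {
  lm(y ~ x, data = fold_data)
}

fit <- fit_in_wrapper(data.frame(x = 1:5, y = c(2.1, 3.9, 6.2, 7.8, 10.1)))

# The call stores the symbol `fold_data`, which is not visible globally ...
stored <- fit$call$data

# ... but the formula still carries the frame in which it was created,
# so the symbol can be resolved there.
formula_env <- environment(formula(fit))
recovered <- eval(stored, envir = formula_env)
print(dim(recovered))
```

This is also an instance of how it can still break: if the formula is created by the caller and passed into the wrapper, its environment is the caller's, and the wrapper's local names are not found there.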

A reproducible example would be handy.

You should probably submit this to the R bugs list, that's the appropriate place for it (I can submit on your behalf if you don't already have access and don't feel like requesting it).

More generally, I really ought to talk to R-core about the general issue of nlme maintenance/development.

There is an SVN repository for nlme (svn co https://svn.r-project.org/R-packages/trunk/nlme nlme_svn). In principle I could create a branch in SVN, but I don't know if SVN has something corresponding to a GitHub fork (i.e., something that would let me maintain my branch and sync it with upstream without having write access to the original repo ... I guess this is possible?)
There are 19 open bugs on Bugzilla; at least one of them is peripheral; one is my patch for rank-deficient fixed effects, and another is the weights issue you mention above ...
Martin Maechler, Brian Ripley, and Sebastian Meyer are the active developers.

Answer 2 · 2023-04-07T17:31:53.000Z

Hi Ben, thank you for the reply. Sorry to take a while to respond in turn; there have just been too many things trying to happen at once.

A reprex was posted to StackOverflow in 2012. It's linked in the OP. A real-world example is fit_nlme().
I'm not getting search hits for an R bugs list. According to the docs, one is supposed to use bug.report(), even though that's not supported in RStudio. From RGui, bug.report(package = "nlme") goes to https://bugs.r-project.org/, same as your link. bugs.r-project.org wants a login to file an issue, which is reasonable enough, but exposes no option to register for an account that I can find.
This issue also appears to affect mgcv::gamm()'s underlying call to lme(), resulting in some things breaking if you need to work with a gamm object's $lme field. Since that call is internal to mgcv, there doesn't appear to be a workaround for callers: do.call(gamm, list(...)) is not effective.

"I don't know if SVN has something corresponding to a Github fork"

Another option might be to fork https://github.com/cran/nlme/ on GitHub and then rebase over CRAN changes from time to time (this repo doesn't currently show as a fork). It's not an approach friendly to generating SVN patches, but it does at least (mostly) automate one sync direction.

Answer 3 · 2023-04-07T21:05:31.000Z

Very obscurely, if you go to https://www.r-project.org/bugs.html (which is linked from https://bugs.r-project.org/, but not prominently ...) you can find this:

NOTE: due to abuse by spammers, since 2016-07-09 only “members” (including all who have previously submitted bugs) can submit new bugs on R’s Bugzilla. In order to get a bugzilla account (i.e., become “member”), please send an e-mail (from the address you want to use as your login) to bug-report-request@r-project.org briefly explaining why, and a volunteer will add you to R’s Bugzilla members.

As for various forking solutions, I think I'm not going to tackle that right now (ideally I would find a smart minion to do annoying things like that for me ... :-) )

For what it's worth, I did write to the organizers of the R Project Sprint (cc'ing Maechler and Meyer) to suggest that an nlme-improving component could be included ...

Answer 4 · 2024-04-19T12:49:46.000Z

For nlme() and gnls(), this should be fixed via PR#18559 in nlme >= 3.1-163 (released 2023-08-09). Similar scoping issues have been fixed before, e.g., PR#15892. A fix for predict.lme() is in the trunk so will be part of nlme >= 3.1-165.
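A small sketch for checking an installed copy against the version thresholds given above, assuming nlme is installed. The variable names are illustrative.

```r
# Compare the installed nlme version with the releases named above.
installed <- packageVersion("nlme")

# nlme() and gnls() fix: nlme >= 3.1-163
nlme_gnls_fixed <- installed >= "3.1-163"

# predict.lme() fix: nlme >= 3.1-165
predict_lme_fixed <- installed >= "3.1-165"

print(c(nlme_gnls = nlme_gnls_fixed, predict_lme = predict_lme_fixed))
```

On older installations, the do.call() workaround from the original post remains the fallback.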

PS: I think this issue should be closed. More generally, as Ben said, R's Bugzilla is the right place to report nlme bugs (with minimal reproducible examples), but yes, the response rate for these is rather slow (small number of volunteers) ...