szcf-weiya/fRLR

warning: stack imbalance

szcf-weiya opened this issue · 5 comments

email from CRAN

On 16/07/2021 09:28, Prof Brian Ripley wrote:

Dear maintainer,

Please see the problems shown on
https://cran.r-project.org/web/checks/check_results_fRLR.html.

Please correct before 2021-07-30 to safely retain your package on CRAN.

The CRAN Team

This started with Rcpp's update to 1.0.7, and seems to be (on most platforms) corrupting R's memory allocations which has made it impossible so far to get a handle on it.

--
Brian D. Ripley, ripley@stats.ox.ac.uk
Emeritus Professor of Applied Statistics, University of Oxford

The check details are at https://cran.r-project.org/web/checks/check_results_fRLR.html:

checking re-building of vignette outputs ... [30m/35m] WARNING
Error(s) in re-building vignettes:
--- re-building ‘fRLR.Rnw’ using Sweave

first attempt: replace Sweave with rmarkdown+roxygen2

Not yet aware of the core issue, I first tried switching to the modern documentation tools, see 0338e95.

I then resubmitted, but the package still could not pass the check because of the same issue; see https://win-builder.r-project.org/incoming_pretest/fRLR_1.2_20210716_175519/Debian/00check.log for more details.

* checking re-building of vignette outputs ... [0m/30m] WARNING
Error(s) in re-building vignettes:
  ...
--- re-building ‘fRLR.Rmd’ using rmarkdown
Warning: stack imbalance in 'withVisible', 48 then 228

* checking PDF version of manual ... OK
* checking for non-standard things in the check directory ... OK
* checking for detritus in the temp directory ... OK
* DONE
Status: 1 WARNING

digging into the issue

  • what is stack imbalance?

A stack imbalance occurs when the data structure used to keep track of called functions, arguments, and return values becomes corrupted or misaligned.

Most times, the stack is a memory pointer that stores the address where control will resume when the current function call exits back to the caller. There are different variants on this, sometimes the arguments to a function are also appended to the stack, as well as the return value. What is most important here is that the caller and callee should agree upon how to restore it back to the prior state when the callee exits. This agreement is frequently known as the Calling Convention.

source: https://stackoverflow.com/a/6779895

potentially similar issue:

When R reports a stack imbalance, it means the internal protection stack (which is used to protect R objects on the stack from the garbage collector) is unbalanced, indicating that something got out of sync in some C / C++ routine somewhere. This is unrelated to calling conventions.

Rcpp generally manages protection of its objects outside of the protection stack; it explicitly uses the R_PreserveObject() and R_ReleaseObject() APIs, which are not stack-based.
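To make the "N then M" numbers in the warning concrete, here is a toy model in plain C++. It is not R's implementation; it only assumes, as the quoted explanation describes, that the depth of the protection stack is recorded before a call and compared with the depth after the call returns. All names (ToyProtectStack, check_call, balanced_routine, leaky_routine) are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <functional>
#include <string>
#include <vector>

// Toy stand-in for a protection stack (hypothetical, not R's code).
// Objects are represented by plain integer ids.
struct ToyProtectStack {
    std::vector<int> slots;

    void protect(int id) { slots.push_back(id); }

    void unprotect(int n) {
        assert(n >= 0 && static_cast<std::size_t>(n) <= slots.size());
        slots.resize(slots.size() - static_cast<std::size_t>(n));
    }

    int depth() const { return static_cast<int>(slots.size()); }
};

// Run `body`, then compare the stack depth before and after the call.
// Prints a message in the style of the warning and returns false on mismatch.
bool check_call(ToyProtectStack& stack, const std::string& name,
                const std::function<void(ToyProtectStack&)>& body) {
    const int before = stack.depth();
    body(stack);
    const int after = stack.depth();
    if (before != after) {
        std::printf("Warning: stack imbalance in '%s', %d then %d\n",
                    name.c_str(), before, after);
        return false;
    }
    return true;
}

// Balanced: every protect is matched by an unprotect.
void balanced_routine(ToyProtectStack& stack) {
    stack.protect(10);
    stack.protect(11);
    stack.unprotect(2);
}

// Unbalanced: one protected slot is never released.
void leaky_routine(ToyProtectStack& stack) {
    stack.protect(10);
    stack.protect(11);
    stack.unprotect(1);
}
```

In this model, a routine that leaves the depth unchanged passes silently, while one that forgets a single release is reported with the depth before and the depth after, which is how the two numbers in the warning should be read.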

Note that R_PreserveObject() and R_ReleaseObject() also appear in the change log of the latest Rcpp release, 1.0.7 (https://github.com/RcppCore/Rcpp/releases/tag/1.0.7), which reminds me of the comment from Prof. Ripley:

This started with Rcpp's update to 1.0.7, and seems to be (on most platforms) corrupting R's memory allocations which has made it impossible so far to get a handle on it.

I also checked other possibly similar issues, but found no idea on how to solve it.

reproduce

At first, I observed the warning only in the last code chunk

fRLR/vignettes/fRLR.Rmd

Lines 203 to 215 in 0338e95

set.seed(123)
n = 100
X = matrix(rnorm(10*n), 10, n)
Y = rnorm(10)
COV = matrix(rnorm(40), 10, 4)
#idx1 = c(1, 2, 3, 4, 1, 1, 1, 2, 2, 3)
#idx2 = c(2, 3, 4, 5, 3, 4, 5, 4, 5, 5)
id = combn(n, 2)
idx1 = id[1, ]
idx2 = id[2, ]
system.time(frlr2(X, idx1, idx2, Y, COV))

  |......................................................................| 100%
label: unnamed-chunk-7

Warning: stack imbalance in 'withVisible', 37 then 38

I then suspected a problem in frlr2. However, the warning does not appear on every run, and it can also appear for frlr1 if I repeat the procedure several times. This is the same phenomenon as described in

Note that the behavior is erratic: sometimes, I have no warning at all, sometimes it is at the first run
source: https://stackoverflow.com/questions/27523979/stack-imbalance-with-rcppparallel

The warning is easy to observe by running the function several times in a row.

frequency of the warning for frlr2

with # pragma omp parallel for schedule(dynamic)

> a = frlr2(X, idx1, idx2, Y, COV)
> a = frlr2(X, idx1, idx2, Y, COV)
Warning: stack imbalance in '=', 2 then 1
> a = frlr2(X, idx1, idx2, Y, COV)
> a = frlr2(X, idx1, idx2, Y, COV)
> a = frlr2(X, idx1, idx2, Y, COV)
Warning: stack imbalance in '=', 2 then 5
> a = frlr2(X, idx1, idx2, Y, COV)
Warning: stack imbalance in '=', 2 then 3
> a = frlr2(X, idx1, idx2, Y, COV)
Warning: stack imbalance in '=', 2 then 1
> a = frlr2(X, idx1, idx2, Y, COV)
> a = frlr2(X, idx1, idx2, Y, COV)
Warning: stack imbalance in '=', 2 then 5
> a = frlr2(X, idx1, idx2, Y, COV)
> a = frlr2(X, idx1, idx2, Y, COV)
Warning: stack imbalance in '=', 2 then 4
> a = frlr2(X, idx1, idx2, Y, COV)
Warning: stack imbalance in '=', 2 then 3
> a = frlr2(X, idx1, idx2, Y, COV)
Warning: stack imbalance in '=', 2 then 3

with # pragma omp parallel for schedule(dynamic) commented out

> a = frlr2(X, idx1, idx2, Y, COV)
> a = frlr2(X, idx1, idx2, Y, COV)
> a = frlr2(X, idx1, idx2, Y, COV)
> a = frlr2(X, idx1, idx2, Y, COV)
> a = frlr2(X, idx1, idx2, Y, COV)
> a = frlr2(X, idx1, idx2, Y, COV)
> a = frlr2(X, idx1, idx2, Y, COV)
> a = frlr2(X, idx1, idx2, Y, COV)
> a = frlr2(X, idx1, idx2, Y, COV)
> a = frlr2(X, idx1, idx2, Y, COV)
> a = frlr2(X, idx1, idx2, Y, COV)
> a = frlr2(X, idx1, idx2, Y, COV)
> a = frlr2(X, idx1, idx2, Y, COV)
> a = frlr2(X, idx1, idx2, Y, COV)
> a = frlr2(X, idx1, idx2, Y, COV)
> a = frlr2(X, idx1, idx2, Y, COV)

with # pragma omp parallel for

> a = frlr2(X, idx1, idx2, Y, COV)
> a = frlr2(X, idx1, idx2, Y, COV)
Warning: stack imbalance in '=', 2 then 3
> a = frlr2(X, idx1, idx2, Y, COV)
> a = frlr2(X, idx1, idx2, Y, COV)
Warning: stack imbalance in '=', 2 then 1
> a = frlr2(X, idx1, idx2, Y, COV)
> a = frlr2(X, idx1, idx2, Y, COV)
Warning: stack imbalance in '=', 2 then -1
> a = frlr2(X, idx1, idx2, Y, COV)
> a = frlr2(X, idx1, idx2, Y, COV)
Warning: stack imbalance in '=', 2 then 3
> a = frlr2(X, idx1, idx2, Y, COV)
Warning: stack imbalance in '=', 2 then 3

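The warning appears only when an OpenMP pragma is enabled, with or without schedule(dynamic). Thus, the issue is caused by the OpenMP parallel loop.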

similar issue on the mailing list

While preparing to open an issue at https://github.com/RcppCore/Rcpp/issues/new, I found that the issue template gives detailed instructions. Following them, I searched the related mailing list by typing

site:lists.r-forge.r-project.org rcpp-devel stack imbalance warning

The search results are old and there seem to be only 3 of them, so I checked the most recent one (2014), https://lists.r-forge.r-project.org/pipermail/rcpp-devel/2014-March/007338.html. The general recommendation by Dirk is

OpenMP is by design multithreaded. R is very famously not set up for that.
By calling back into R, you set yourself up for trouble,

Just accessing / setting data structures should work (but test first...),
RNGs is clearly asking for trouble.

In short, for OpenMP use plain C++ constructs. But not R. Use R and Rcpp
to get your data to your OpenMP code portions, run those (carefully) in
vanilla C++ (or even vanilla C++11) and then return to R.

I then also checked another thread, https://lists.r-forge.r-project.org/pipermail/rcpp-devel/2011-July/002639.html, following the previous and next links for more messages.

Similar guidelines were posted there. However, the R variables in my loop are only read, never modified, which should be OK, just as in the example omp3 in https://scholar.princeton.edu/sites/default/files/q-aps/files/slides_day4_pm.pdf

Note that

So far, as long as the Rcpp stuff happens in the loop directly (same function), it works. But if I call another function from the parallelized loop, and do some Rcpp work in that function, that's when I get problems.

So my guess is that it may be enough to replace the R variables in the calling function; that is, I only need to convert X from NumericVector to vector<double>.

Then it works!
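A minimal sketch of this pattern, in plain C++ so that it compiles without R: the data is first copied into a std::vector<double>, and the parallel loop then touches only standard C++ objects. This is not the actual frlr2 code. The function name pair_dot_products and the dot-product workload are hypothetical stand-ins for the per-pair computation, and the matrix is assumed to be stored column-major, as R stores it.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the per-pair work inside the parallel loop:
// the dot product of two columns of a column-major matrix.
//   x          : matrix entries already copied out of the R object
//   nrow       : number of rows of the matrix
//   idx1, idx2 : 1-based column indices, as produced by combn() in R
std::vector<double> pair_dot_products(const std::vector<double>& x,
                                      int nrow,
                                      const std::vector<int>& idx1,
                                      const std::vector<int>& idx2) {
    assert(nrow > 0 && x.size() % static_cast<std::size_t>(nrow) == 0);
    assert(idx1.size() == idx2.size());

    const int npairs = static_cast<int>(idx1.size());
    std::vector<double> out(idx1.size(), 0.0);

    // Only plain C++ objects are read or written inside the parallel region.
#ifdef _OPENMP
#pragma omp parallel for schedule(dynamic)
#endif
    for (int k = 0; k < npairs; ++k) {
        // Convert the 1-based column indices to offsets into x.
        const std::size_t off1 = static_cast<std::size_t>(idx1[k] - 1) * nrow;
        const std::size_t off2 = static_cast<std::size_t>(idx2[k] - 1) * nrow;
        double s = 0.0;
        for (int i = 0; i < nrow; ++i) {
            s += x[off1 + i] * x[off2 + i];
        }
        out[k] = s;  // each iteration writes only its own slot
    }
    return out;
}
```

In an Rcpp wrapper, one way to make the copy before entering the loop would be Rcpp::as<std::vector<double> >(X), or constructing the vector from X.begin() and X.end(); the result can be wrapped back into an R object after the loop has finished.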