jwood000/RcppAlgos

clang-ASAN gcc-ASAN & valgrind issues

jwood000 opened this issue · 2 comments

There are serious memory issues reported by CRAN (https://CRAN.R-project.org/package=RcppAlgos) for version 2.3.3 on 2019-06-30.

We have identified and corrected all issues. They were caused by the following lines in GeneralPartitions.h:

while (z[maxIndex - 1] == currMax)

while (z[maxIndex - 1] == currMax)

while (z[maxIndex] - z[edge] < 2)

for (int i = 0; i < r; ++i) {

for (int i = 0; i < r; ++i) {

In all cases, we would eventually try to access memory that we don't own (E.g. with an index of -1 or one past the last element).

We identified these issues and confirmed their correctness by following the instruction outlined by Brian Knaus found here: (https://knausb.github.io/2017/06/reproducing-a-clang-ubsan-issue/). We also gained valuable insight from Dirk's post on this issue (http://dirk.eddelbuettel.com/code/sanitizers.html).

First, we start docker.

Josephs-MBP:~ josephwood$ docker-machine start default
Starting "default"...
(default) Check network to re-create if needed...
(default) Waiting for an IP...
Machine "default" was started.
Waiting for SSH to be available...
Detecting the provisioner...
Started machines may have new IP addresses. You may need to re-run the `docker-machine env` command.

Now, we have set our environment variables as explained here: https://stackoverflow.com/a/40040077/4408538

Josephs-MBP:~ josephwood$ eval $(docker-machine env)

Pull an updated image from docker

Josephs-MBP:~ josephwood$ docker pull rocker/r-devel-ubsan-clang

Start a docker container ensuring we have enough memory (I.e. -m=10g). We also need enable ptrace: --cap-add SYS_PTRACE otherwise you will get something like 'LeakSanitizer does not work under ptrace'. N.B. In order for this to run all tests, I had to comment out any test that expected errors.

Josephs-MBP:~ josephwood$ docker run -m=10g --cap-add SYS_PTRACE --name=r-devel-ubsan-clang -v ~/RcppAlgos:/RSource/RcppAlgos --rm -ti rocker/r-devel-ubsan-clang /bin/bash
root@228d8325fa36:/#

Add the relevant libraries

root@228d8325fa36:/# apt-get update
.
.

root@228d8325fa36:/# apt-get install libgmp3-dev pandoc
Reading package lists... Done
Building dependency tree  
.
.

Install all required R packages (this takes a bit for me)

root@228d8325fa36:/# R -e 'install.packages(c("gmp", "Rcpp", "numbers", "bigIntegerAlgos", "RcppThread", "testthat", "microbenchmark"), dependencies = TRUE, lib = "/usr/local/lib/R/site-library")'

Build our package

root@228d8325fa36:/# R CMD build RSource/RcppAlgos
* checking for file ‘RSource/RcppAlgos/DESCRIPTION’ ... OK
.

Check the package:

root@228d8325fa36:/# Rdevel CMD check --no-examples RcppAlgos_2.3.3.tar.gz 

The check bombs and looking in RcppAlgos.RCheck/tests/testthat.Rout.fail

root@228d8325fa36:/# vi RcppAlgos.RCheck/tests/testthat.Rout.fail

... we see exactly what is reported on CRAN:

.
.
.

> library(testthat)
> library(RcppAlgos)
> 
> test_check("RcppAlgos")
=================================================================
==42603==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x60800000b9f8 at pc 0x7facbd47acbf bp 0x7ffd9b3695d0 sp 0x7ffd9b3695c8
READ of size 8 at 0x60800000b9f8 thread T0
    #0 0x7facbd47acbe in Partitions::PartitionsDistinct(int, int, std::__1::vector<long, std::__1::allocator<long> > const&, long, int, int, int, bool, std::__1::vector<long, std::__1::allocator<long> >&) /data/gannet/ripley/R/packages/tests-clang-SAN/RcppAlgos/src/../inst/include/GeneralPartitions.h:345:24
    #1 0x7facbd4a21bb in Rcpp::Matrix<13, Rcpp::PreserveStorage> Partitions::GeneralPartitions<Rcpp::Matrix<13, Rcpp::PreserveStorage> >(int, int, std::__1::vector<long, std::__1::allocator<long> >&, long, bool, double, bool, bool, bool) /data/gannet/ripley/R/packages/tests-clang-SAN/RcppAlgos/src/../inst/include/GeneralPartitions.h:600:22
    #2 0x7facbd482afa in CombinatoricsRcpp(SEXPREC*, SEXPREC*, SEXPREC*, SEXPREC*, SEXPREC*, SEXPREC*, SEXPREC*, SEXPREC*, SEXPREC*, bool, SEXPREC*, bool, bool, SEXPREC*, SEXPREC*, SEXPREC*, SEXPREC*, int, SEXPREC*) /data/gannet/ripley/R/packages/tests-clang-SAN/RcppAlgos/src/Combinatorics.cpp:544:36

Changing the offending lines above as follows:

while (maxIndex > 0 && z[maxIndex - 1] == currMax)

while (maxIndex > 0 && z[maxIndex - 1] == currMax)

while (edge > 0 && z[maxIndex] - z[edge] < 2)

for (int i = 0; i < lastCol; ++i) {

for (int i = 0; i < lastCol; ++i) {

And running the docker container as before we obtain:

root@228d8325fa36:/# Rdevel CMD check --no-examples RcppAlgos_2.3.4.tar.gz 

And here is the output of RcppAlgos.RCheck/tests/testthat.Rout.fail

.
.
.
r some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> library(testthat)
> library(RcppAlgos)
>
> test_check("RcppAlgos")
══ testthat results  ═══════════════════════════════════════════════════════════
OK: 521 SKIPPED: 0 WARNINGS: 0 FAILED: 0
>
> proc.time()
   user  system elapsed
 40.380   2.090  46.797
=================================================================
==11920==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 99504 byte(s) in 12438 object(s) allocated from:
    #0 0x4c6143 in __interceptor_malloc (/usr/local/lib/R/bin/exec/R+0x4c6143)
    #1 0x7f2efe75a8f8 in __gmp_default_allocate (/usr/lib/x86_64-linux-gnu/libgmp.so.10+0xc8f8)
    #2 0x7f2efe47da07 in _RcppAlgos_CombinatoricsRcpp /RcppAlgos.Rcheck/00_pkg_src/RcppAlgos/src/RcppExports.cpp:43:34
    #3 0x7f2f06ee234e in R_doDotCall /tmp/R-devel/src/main/dotcode.

Note the junk at the bottom. In my experience these aren't real memory issues as typically, when there is a real memory issue, the tests bombs and nothing is reported (i.e. is no validation that the tests completed). For example, the fact that we have OK: 521 SKIPPED: 0 WARNINGS: 0 FAILED: 0 is confirmation that we addressed the real memory issues.

New Procedures Moving Forward

One of the goals of this release was in response to a clang-UBSAN issue which we have encountered before. It was occurring when a constraint was applied in conjunction with preparing our limitConstraint vector for integer returns without first checking if these values would fit into an integer vector.

Because of the serious significance of these issues, the potentially catastrophic results, and the frequent oversight by my development process, we will no longer simply rely on win-builder, rhub, Travis-CI, covr, and a determined spirit of scouring every line of code before submission.

In addition to the checks above, we will consult docker frequently and address issues more aggressively.

Regards,
Joseph Wood