mlr-org/mlrMBO

double free or corruption (!prev)

mattdowle opened this issue · 3 comments

Hi,
I've just submitted data.table 1.12.2 to CRAN. I had no issues with mlrMBO in revdep checking: it passed R CMD check cleanly and wasn't affected by any data.table changes in this release.
However, when CRAN ran its own revdep tests, mlrMBO failed with the error "double free or corruption (!prev)". I think this is unlikely to be due to data.table, so I asked the CRAN maintainers to go ahead and publish data.table. This issue is just a follow-up. Below is the reply I received from CRAN checks, followed by my results from running mlrMBO through R-devel compiled with ASAN and strict-barrier. There appears to be a memory problem somewhere, but I'm afraid I didn't manage to get a line number for you.
HTH, Matt

Package: mlrMBO
Check: tests
New result: ERROR
    Running ‘testthat.R’ [37s/37s]
  Running the tests in ‘tests/testthat.R’ failed.
  Complete output:
    > library(testthat)
    > 
    > # the unit tests take pretty long, that can be a problem on WB and cran (and maybe annoying locally)
    > # so we run all tests only on travis and if a certain user env var is set
    > if (identical(Sys.getenv("TRAVIS"), "true") || identical(Sys.getenv("R_EXPENSIVE_TEST_OK"), "true")) {
    +   test_check("mlrMBO")
    + } else {
    +   test_check("mlrMBO", filter = "((mbo_rf)|(mbo_km))")
    + }
    Loading required package: mlrMBO
    Loading required package: mlr
    Loading required package: ParamHelpers
    Loading required package: smoof
    Loading required package: BBmisc

    Attaching package: 'BBmisc'

    The following object is masked from 'package:base':

        isFALSE

    Loading required package: checkmate
    double free or corruption (!prev)
    Aborted

$ Rdevel-strict CMD check mlrMBO_1.1.2.tar.gz
* using log directory ‘/home/mdowle/build/revdeplib/mlrMBO.Rcheck’
* using R Under development (unstable) (2019-03-26 r76272)
* using platform: x86_64-pc-linux-gnu (64-bit)
* using session charset: UTF-8
* checking for file ‘mlrMBO/DESCRIPTION’ ... OK
* this is package ‘mlrMBO’ version ‘1.1.2’
* package encoding: UTF-8
* checking package namespace information ... OK
* checking package dependencies ... OK
* checking if this is a source package ... OK
* checking if there is a namespace ... OK
* checking for executable files ... OK
* checking for hidden files and directories ... OK
* checking for portable file names ... OK
* checking for sufficient/correct file permissions ... OK
* checking whether package ‘mlrMBO’ can be installed ... OK
* checking installed package size ... OK
* checking package directory ... OK
* checking ‘build’ directory ... OK
* checking DESCRIPTION meta-information ... OK
* checking top-level files ... OK
* checking for left-over files ... OK
* checking index information ... OK
* checking package subdirectories ... OK
* checking R files for non-ASCII characters ... OK
* checking R files for syntax errors ... OK
* checking whether the package can be loaded ... OK
* checking whether the package can be loaded with stated dependencies ... OK
* checking whether the package can be unloaded cleanly ... OK
* checking whether the namespace can be loaded with stated dependencies ... OK
* checking whether the namespace can be unloaded cleanly ... OK
* checking loading without being on the library search path ... OK
* checking dependencies in R code ... OK
* checking S3 generic/method consistency ... OK
* checking replacement functions ... OK
* checking foreign function calls ... OK
* checking R code for possible problems ... OK
* checking Rd files ... OK
* checking Rd metadata ... OK
* checking Rd cross-references ... OK
* checking for missing documentation entries ... OK
* checking for code/documentation mismatches ... OK
* checking Rd \usage sections ... OK
* checking Rd contents ... OK
* checking for unstated dependencies in examples ... OK
* checking line endings in C/C++/Fortran sources/headers ... OK
* checking compiled code ... OK
* checking installed files from ‘inst/doc’ ... OK
* checking files in ‘vignettes’ ... OK
* checking examples ... OK
* checking for unstated dependencies in ‘tests’ ... OK
* checking tests ...
  Running ‘testthat.R’
 ERROR
Running the tests in ‘tests/testthat.R’ failed.
Last 13 lines of output:
      #17 0x562d158dd8cf in Rf_applyClosure /home/mdowle/build/R-devel-strict/src/main/eval.c:1706
      #18 0x562d1590a2c2 in bcEval /home/mdowle/build/R-devel-strict/src/main/eval.c:6733
      #19 0x562d158d91f5 in Rf_eval /home/mdowle/build/R-devel-strict/src/main/eval.c:620
      #20 0x562d158de3f3 in R_execClosure /home/mdowle/build/R-devel-strict/src/main/eval.c:1780
      #21 0x562d158dd8cf in Rf_applyClosure /home/mdowle/build/R-devel-strict/src/main/eval.c:1706
      #22 0x562d158da02b in Rf_eval /home/mdowle/build/R-devel-strict/src/main/eval.c:743
      #23 0x562d158e877a in do_set /home/mdowle/build/R-devel-strict/src/main/eval.c:2807
      #24 0x562d158d998a in Rf_eval /home/mdowle/build/R-devel-strict/src/main/eval.c:695
      #25 0x562d158e68cd in do_begin /home/mdowle/build/R-devel-strict/src/main/eval.c:2382
      #26 0x562d158d998a in Rf_eval /home/mdowle/build/R-devel-strict/src/main/eval.c:695
      #27 0x562d158de3f3 in R_execClosure /home/mdowle/build/R-devel-strict/src/main/eval.c:1780
      #28 0x562d158dd8cf in Rf_applyClosure /home/mdowle/build/R-devel-strict/src/main/eval.c:1706
      #29 0x562d158da02b in Rf_eval /home/mdowle/build/R-devel-strict/src/main/eval.c:743
  
  SUMMARY: AddressSanitizer: 216 byte(s) leaked in 9 allocation(s).
* checking for unstated dependencies in vignettes ... OK
* checking package vignettes in ‘inst/doc’ ... OK
* checking running R code from vignettes ...
   ‘mlrMBO.Rmd’ using ‘UTF-8’ ... OK
 NONE
* checking re-building of vignette outputs ... WARNING
Error(s) in re-building vignettes:
--- re-building ‘mlrMBO.Rmd’ using rmarkdown
--- finished re-building ‘mlrMBO.Rmd’


=================================================================
==21721==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 6656 byte(s) in 26 object(s) allocated from:
    #0 0x7f7605814b50 in __interceptor_malloc (/usr/lib/x86_64-linux-gnu/libasan.so.4+0xdeb50)
    #1 0x7f75e4e488ed  (/usr/lib/x86_64-linux-gnu/libfontconfig.so.1+0x1d8ed)

Direct leak of 6400 byte(s) in 10 object(s) allocated from:
    #0 0x7f7605814f40 in realloc (/usr/lib/x86_64-linux-gnu/libasan.so.4+0xdef40)
    #1 0x7f75e4e48998  (/usr/lib/x86_64-linux-gnu/libfontconfig.so.1+0x1d998)

Indirect leak of 19680 byte(s) in 615 object(s) allocated from:
    #0 0x7f7605814b50 in __interceptor_malloc (/usr/lib/x86_64-linux-gnu/libasan.so.4+0xdeb50)
    #1 0x7f75e4e36fef  (/usr/lib/x86_64-linux-gnu/libfontconfig.so.1+0xbfef)

Indirect leak of 9299 byte(s) in 778 object(s) allocated from:
    #0 0x7f76057ad538 in strdup (/usr/lib/x86_64-linux-gnu/libasan.so.4+0x77538)
    #1 0x7f75e4e482f4 in FcValueSave (/usr/lib/x86_64-linux-gnu/libfontconfig.so.1+0x1d2f4)

Indirect leak of 6912 byte(s) in 216 object(s) allocated from:
    #0 0x7f7605814d38 in __interceptor_calloc (/usr/lib/x86_64-linux-gnu/libasan.so.4+0xded38)
    #1 0x7f75e4e48fd8  (/usr/lib/x86_64-linux-gnu/libfontconfig.so.1+0x1dfd8)

Indirect leak of 3840 byte(s) in 120 object(s) allocated from:
    #0 0x7f7605814d38 in __interceptor_calloc (/usr/lib/x86_64-linux-gnu/libasan.so.4+0xded38)
    #1 0x7f75e4e485c4  (/usr/lib/x86_64-linux-gnu/libfontconfig.so.1+0x1d5c4)

Indirect leak of 960 byte(s) in 30 object(s) allocated from:
    #0 0x7f7605814d38 in __interceptor_calloc (/usr/lib/x86_64-linux-gnu/libasan.so.4+0xded38)
    #1 0x7f75e4e48440  (/usr/lib/x86_64-linux-gnu/libfontconfig.so.1+0x1d440)

Indirect leak of 320 byte(s) in 10 object(s) allocated from:
    #0 0x7f7605814d38 in __interceptor_calloc (/usr/lib/x86_64-linux-gnu/libasan.so.4+0xded38)
    #1 0x7f75e4e484b9  (/usr/lib/x86_64-linux-gnu/libfontconfig.so.1+0x1d4b9)

Indirect leak of 240 byte(s) in 5 object(s) allocated from:
    #0 0x7f7605814b50 in __interceptor_malloc (/usr/lib/x86_64-linux-gnu/libasan.so.4+0xdeb50)
    #1 0x7f75e4e42acd in FcLangSetCreate (/usr/lib/x86_64-linux-gnu/libfontconfig.so.1+0x17acd)

SUMMARY: AddressSanitizer: 54307 byte(s) leaked in 1810 allocation(s).

* checking PDF version of manual ... OK
* DONE

Status: 1 ERROR, 1 WARNING
See
  ‘/home/mdowle/build/revdeplib/mlrMBO.Rcheck/00check.log’
for details.

Thanks for the info.
@mllg Do you have an idea where this could come from?

mllg commented

You first need to narrow it down. Unfortunately, this is very tedious:

See whether you can reproduce this on r-hub. Then you need to selectively enable/disable tests ... I would first try to confirm that the C code in mlrMBO is the problem, and not the C code in one of the dependencies (BBmisc, for example).

I had a brief look at the mlrMBO source. It uses malloc/free, and these are all the free() calls:

./mlrMBO/src$ grep -n -B 2 -A 2 "free(" *.c
avl.c-205-        if(freeitem)
avl.c-206-            freeitem(node->item);
avl.c:207:        free(node);
avl.c-208-    }
avl.c-209-    avl_clear_tree(avltree);
--
avl.c-218-void avl_free_tree(avl_tree_t *avltree) {
avl.c-219-    avl_free_nodes(avltree);
avl.c:220:    free(avltree);
avl.c-221-}
avl.c-222-
--
avl.c-326-        if(avl_insert_node(avltree, newnode))
avl.c-327-            return newnode;
avl.c:328:        free(newnode);
avl.c-329-        errno = EEXIST;
avl.c-330-    }
--
avl.c-399-        if(avltree->freeitem)
avl.c-400-            avltree->freeitem(item);
avl.c:401:        free(avlnode);
avl.c-402-    }
avl.c-403-    return item;
--
hv.c-130-    }
hv.c-131-
hv.c:132:    free(scratch);
hv.c-133-
hv.c-134-    for (i = 1; i <= n; i++)
--
hv.c-142-
hv.c-143-static void free_cdllist(dlnode_t * head) {
hv.c:144:    free(head->tnode); /* Frees _all_ nodes. */
hv.c:145:    free(head->next);
hv.c:146:    free(head->prev);
hv.c:147:    free(head->area);
hv.c:148:    free(head->vol);
hv.c:149:    free(head);
hv.c-150-}
hv.c-151-

Given the error "double free or corruption (!prev)", it's good practice to set each pointer to NULL after free() so that it can't accidentally be used or freed again. Running through ASAN again after this change may let it spot the root problem. But most of these free() calls are inside functions defined by mlrMBO, so you'll need to find each caller and make sure it sets its pointer to NULL after calling the free_* function on it.

There are also 2 references to "leak" in the source. If I understand correctly, a straightforward leak should show up as just a leak and not as "double free or corruption (!prev)", but these kinds of memory errors tend to be closely related and can give you clues.

./mlrMBO/src$ grep -n "leak" *.c
hv.c:357:    /* The memory allocated for the deleted node is lost (leaked)
hv.c:402:        /* Returning here would leak memory.  */