r-spatialecology/landscapemetrics

Performance of `rcpp_get_unique_values()`

Closed this issue · 6 comments

I tried two computers while working on #87. The first is a laptop with an i5 and the second is a desktop with a much better i7. However, `rcpp_get_unique_values()` runs roughly 8 to 10 times faster on the laptop (median 4.57ms vs. 37.7ms). Can you think of why that is? @marcosci @mhesselbarth @bitbacchus

laptop with i5

library(landscapemetrics)
library(raster)
#> Loading required package: sp
bench::mark(landscapemetrics:::rcpp_get_unique_values(as.matrix(augusta_nlcd)))
#> # A tibble: 1 x 10
#>   expression    min   mean median   max `itr/sec` mem_alloc  n_gc n_itr
#>   <chr>      <bch:> <bch:> <bch:> <bch>     <dbl> <bch:byt> <dbl> <int>
#> 1 landscape… 4.18ms 4.54ms 4.57ms 5.6ms      220.    4.72MB     3   104
#> # … with 1 more variable: total_time <bch:tm>

Created on 2019-04-19 by the reprex package (v0.2.1)

Session info
devtools::session_info()
#> ─ Session info ──────────────────────────────────────────────────────────
#>  setting  value                          
#>  version  R version 3.5.3 (2019-03-11)   
#>  os       Fedora 29 (Workstation Edition)
#>  system   x86_64, linux-gnu              
#>  ui       X11                            
#>  language (EN)                           
#>  collate  en_US.UTF-8                    
#>  ctype    en_US.UTF-8                    
#>  tz       Europe/Warsaw                  
#>  date     2019-04-19                     
#> 
#> ─ Packages ──────────────────────────────────────────────────────────────
#>  package          * version date       lib source
#>  assertthat         0.2.0   2017-04-11 [1] CRAN (R 3.5.1)
#>  backports          1.1.3   2018-12-14 [1] CRAN (R 3.5.1)
#>  bench              1.0.1   2018-06-06 [1] CRAN (R 3.5.1)
#>  callr              3.1.1   2018-12-21 [1] CRAN (R 3.5.1)
#>  cli                1.0.1   2018-09-25 [1] CRAN (R 3.5.1)
#>  codetools          0.2-16  2018-12-24 [2] CRAN (R 3.5.3)
#>  crayon             1.3.4   2017-09-16 [1] CRAN (R 3.5.1)
#>  desc               1.2.0   2018-05-01 [1] CRAN (R 3.5.1)
#>  devtools           2.0.1   2018-10-26 [1] CRAN (R 3.5.2)
#>  digest             0.6.18  2018-10-10 [1] CRAN (R 3.5.1)
#>  evaluate           0.13    2019-02-12 [1] CRAN (R 3.5.2)
#>  fansi              0.4.0   2018-10-05 [1] CRAN (R 3.5.1)
#>  fs                 1.2.6   2018-08-23 [1] CRAN (R 3.5.1)
#>  glue               1.3.0   2018-07-17 [1] CRAN (R 3.5.1)
#>  highr              0.7     2018-06-09 [1] CRAN (R 3.5.1)
#>  htmltools          0.3.6   2017-04-28 [1] CRAN (R 3.5.1)
#>  knitr              1.22    2019-03-08 [1] CRAN (R 3.5.2)
#>  landscapemetrics * 1.1     2019-04-19 [1] local
#>  lattice            0.20-38 2018-11-04 [2] CRAN (R 3.5.3)
#>  magrittr           1.5     2014-11-22 [1] CRAN (R 3.5.1)
#>  memoise            1.1.0   2017-04-21 [1] CRAN (R 3.5.1)
#>  pillar             1.3.1   2018-12-15 [1] CRAN (R 3.5.1)
#>  pkgbuild           1.0.2   2018-10-16 [1] CRAN (R 3.5.1)
#>  pkgconfig          2.0.2   2018-08-16 [1] CRAN (R 3.5.1)
#>  pkgload            1.0.2   2018-10-29 [1] CRAN (R 3.5.1)
#>  prettyunits        1.0.2   2015-07-13 [1] CRAN (R 3.5.1)
#>  processx           3.3.0   2019-03-10 [1] CRAN (R 3.5.2)
#>  profmem            0.5.0   2018-01-30 [1] CRAN (R 3.5.1)
#>  ps                 1.3.0   2018-12-21 [1] CRAN (R 3.5.1)
#>  R6                 2.4.0   2019-02-14 [1] CRAN (R 3.5.2)
#>  raster           * 2.9-1   2019-03-11 [1] Github (rspatial/raster@faf518e)
#>  Rcpp               1.0.0   2018-11-07 [1] CRAN (R 3.5.1)
#>  remotes            2.0.2   2018-10-30 [1] CRAN (R 3.5.1)
#>  rlang              0.3.1   2019-01-08 [1] CRAN (R 3.5.1)
#>  rmarkdown          1.11    2018-12-08 [1] CRAN (R 3.5.1)
#>  rprojroot          1.3-2   2018-01-03 [1] CRAN (R 3.5.1)
#>  sessioninfo        1.1.1   2018-11-05 [1] CRAN (R 3.5.1)
#>  sp               * 1.3-1   2018-06-05 [1] CRAN (R 3.5.1)
#>  stringi            1.3.1   2019-02-13 [1] CRAN (R 3.5.2)
#>  stringr            1.4.0   2019-02-10 [1] CRAN (R 3.5.2)
#>  testthat           2.0.1   2018-10-13 [1] CRAN (R 3.5.1)
#>  tibble             2.0.1   2019-01-12 [1] CRAN (R 3.5.1)
#>  usethis            1.4.0   2018-08-14 [1] CRAN (R 3.5.1)
#>  utf8               1.1.4   2018-05-24 [1] CRAN (R 3.5.1)
#>  withr              2.1.2   2018-03-15 [1] CRAN (R 3.5.1)
#>  xfun               0.5.2   2019-03-11 [1] Github (yihui/xfun@d882a87)
#>  yaml               2.2.0   2018-07-25 [1] CRAN (R 3.5.1)
#> 
#> [1] /home/jn/R/x86_64-redhat-linux-gnu-library/3.5
#> [2] /usr/lib64/R/library
#> [3] /usr/share/R/library

desktop with i7

library(landscapemetrics)
library(raster)
#> Loading required package: sp
bench::mark(landscapemetrics:::rcpp_get_unique_values(as.matrix(augusta_nlcd)))
#> # A tibble: 1 x 10
#>   expression    min   mean median    max `itr/sec` mem_alloc  n_gc n_itr
#>   <chr>      <bch:> <bch:> <bch:> <bch:>     <dbl> <bch:byt> <dbl> <int>
#> 1 landscape… 37.3ms 37.9ms 37.7ms 39.2ms      26.4    4.72MB     1    13
#> # … with 1 more variable: total_time <bch:tm>

Created on 2019-04-19 by the reprex package (v0.2.1)

Session info
devtools::session_info()
#> ─ Session info ──────────────────────────────────────────────────────────
#>  setting  value                          
#>  version  R version 3.5.3 (2019-03-11)   
#>  os       Fedora 29 (Workstation Edition)
#>  system   x86_64, linux-gnu              
#>  ui       X11                            
#>  language (EN)                           
#>  collate  en_US.UTF-8                    
#>  ctype    en_US.UTF-8                    
#>  tz       Europe/Warsaw                  
#>  date     2019-04-19                     
#> 
#> ─ Packages ──────────────────────────────────────────────────────────────
#>  package          * version date       lib source
#>  assertthat         0.2.1   2019-03-21 [1] CRAN (R 3.5.2)
#>  backports          1.1.3   2018-12-14 [1] CRAN (R 3.5.2)
#>  bench              1.0.1   2018-06-06 [1] CRAN (R 3.5.2)
#>  callr              3.2.0   2019-03-15 [1] CRAN (R 3.5.2)
#>  cli                1.1.0   2019-03-19 [1] CRAN (R 3.5.2)
#>  codetools          0.2-16  2018-12-24 [2] CRAN (R 3.5.3)
#>  crayon             1.3.4   2017-09-16 [1] CRAN (R 3.5.2)
#>  desc               1.2.0   2018-05-01 [1] CRAN (R 3.5.2)
#>  devtools           2.0.1   2018-10-26 [1] CRAN (R 3.5.2)
#>  digest             0.6.18  2018-10-10 [1] CRAN (R 3.5.2)
#>  evaluate           0.13    2019-02-12 [1] CRAN (R 3.5.2)
#>  fansi              0.4.0   2018-10-05 [1] CRAN (R 3.5.2)
#>  fs                 1.2.7   2019-03-19 [1] CRAN (R 3.5.2)
#>  glue               1.3.1   2019-03-12 [1] CRAN (R 3.5.2)
#>  highr              0.8     2019-03-20 [1] CRAN (R 3.5.2)
#>  htmltools          0.3.6   2017-04-28 [1] CRAN (R 3.5.2)
#>  knitr              1.22    2019-03-08 [1] CRAN (R 3.5.2)
#>  landscapemetrics * 1.1     2019-04-19 [1] local
#>  lattice            0.20-38 2018-11-04 [2] CRAN (R 3.5.3)
#>  magrittr           1.5     2014-11-22 [1] CRAN (R 3.5.2)
#>  memoise            1.1.0   2017-04-21 [1] CRAN (R 3.5.2)
#>  pillar             1.3.1   2018-12-15 [1] CRAN (R 3.5.2)
#>  pkgbuild           1.0.3   2019-03-20 [1] CRAN (R 3.5.2)
#>  pkgconfig          2.0.2   2018-08-16 [1] CRAN (R 3.5.2)
#>  pkgload            1.0.2   2018-10-29 [1] CRAN (R 3.5.2)
#>  prettyunits        1.0.2   2015-07-13 [1] CRAN (R 3.5.2)
#>  processx           3.3.0   2019-03-10 [1] CRAN (R 3.5.2)
#>  profmem            0.5.0   2018-01-30 [1] CRAN (R 3.5.2)
#>  ps                 1.3.0   2018-12-21 [1] CRAN (R 3.5.2)
#>  R6                 2.4.0   2019-02-14 [1] CRAN (R 3.5.2)
#>  raster           * 2.9-2   2019-04-12 [1] Github (rspatial/raster@81060dc)
#>  Rcpp               1.0.1   2019-03-17 [1] CRAN (R 3.5.2)
#>  remotes            2.0.4   2019-04-10 [1] CRAN (R 3.5.3)
#>  rlang              0.3.4   2019-04-07 [1] CRAN (R 3.5.3)
#>  rmarkdown          1.12    2019-03-14 [1] CRAN (R 3.5.2)
#>  rprojroot          1.3-2   2018-01-03 [1] CRAN (R 3.5.2)
#>  sessioninfo        1.1.1   2018-11-05 [1] CRAN (R 3.5.2)
#>  sp               * 1.3-1   2018-06-05 [1] CRAN (R 3.5.2)
#>  stringi            1.4.3   2019-03-12 [1] CRAN (R 3.5.2)
#>  stringr            1.4.0   2019-02-10 [1] CRAN (R 3.5.2)
#>  testthat           2.0.1   2018-10-13 [1] CRAN (R 3.5.2)
#>  tibble             2.1.1   2019-03-16 [1] CRAN (R 3.5.2)
#>  usethis            1.4.0   2018-08-14 [1] CRAN (R 3.5.2)
#>  utf8               1.1.4   2018-05-24 [1] CRAN (R 3.5.2)
#>  withr              2.1.2   2018-03-15 [1] CRAN (R 3.5.2)
#>  xfun               0.5     2019-02-20 [1] CRAN (R 3.5.2)
#>  yaml               2.2.0   2018-07-25 [1] CRAN (R 3.5.2)
#> 
#> [1] /home/jn/R/x86_64-redhat-linux-gnu-library/3.5
#> [2] /usr/lib64/R/library
#> [3] /usr/share/R/library

Hm, this sounds weird. Can you try again, setting the number of iterations manually?

bench::mark(landscapemetrics:::rcpp_get_unique_values(as.matrix(augusta_nlcd)), iterations = 10000)

# A tibble: 1 x 14
  expression                 min    mean  median     max `itr/sec` mem_alloc  n_gc n_itr total_time result memory   time  gc
  <chr>                  <bch:t> <bch:t> <bch:t> <bch:t>     <dbl> <bch:byt> <dbl> <int>   <bch:tm> <list> <list>   <lis> <list>
1 landscapemetrics:::rc…  2.63ms  2.97ms  2.97ms  4.31ms      337.    1.14MB   385  9615      28.6s <int…  <Rprofm… <bch… <tibble…

Are the gcc versions the same? Do both PCs have SSDs?
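Besides the compiler version, it may be worth comparing the flags R passes to the compiler, because landscapemetrics is listed with source `local` in both session infos and was therefore compiled on each machine. This is only a possible lead, not an established cause. A minimal sketch, assuming `R CMD config` is available and that a user-level Makevars file (for example `~/.R/Makevars`) might exist on one machine but not the other:

```r
# Query the C++ compiler and flags that R uses when building packages.
r_bin    <- file.path(R.home("bin"), "R")
cxx      <- system2(r_bin, c("CMD", "config", "CXX"), stdout = TRUE)
cxxflags <- system2(r_bin, c("CMD", "config", "CXXFLAGS"), stdout = TRUE)

# A user-level Makevars file can override those flags.
# tools::makevars_user() returns its path, or character(0) if there is none.
user_makevars <- tools::makevars_user()

cat("CXX:     ", cxx, "\n")
cat("CXXFLAGS:", cxxflags, "\n")
if (length(user_makevars) > 0) {
  cat("User Makevars:", user_makevars, "\n")
  writeLines(readLines(user_makevars))
} else {
  cat("No user-level Makevars file found\n")
}
```

Running this on both computers and comparing the output would show whether the optimization level (for example `-O2` versus `-O0`) differs between the two builds.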

Jakub, do you know what the default number of iterations is there? Could that be the cause (maybe there were just too few iterations)?
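As far as I can tell from the `bench::mark()` documentation, there is no fixed default count: with `iterations = NULL` it keeps running until `min_time` (0.5 seconds by default) has elapsed, bounded by `min_iterations` and `max_iterations`. That would fit the first runs above, where `n_itr + n_gc` is 107 on the laptop and 14 on the desktop, both close to 0.5 s of total run time at the reported means. The sketch below contrasts the two stopping rules; the matrix is a synthetic stand-in for `as.matrix(augusta_nlcd)`, and its size and values are arbitrary assumptions.

```r
library(bench)

# Synthetic stand-in for as.matrix(augusta_nlcd); size and values are arbitrary.
set.seed(1)
m <- matrix(sample(1:15, 1e5, replace = TRUE), nrow = 250)

# Time-based stopping (the default): run until min_time has elapsed.
by_time <- bench::mark(unique(as.vector(m)), min_time = 0.5)

# Count-based stopping: exactly 100 runs, regardless of how long they take.
by_count <- bench::mark(unique(as.vector(m)), iterations = 100)

# Iterations in which a garbage collection ran are filtered out of the
# summary statistics by default (filter_gc = TRUE).
by_time[, c("min", "median", "n_itr", "n_gc", "total_time")]
by_count[, c("min", "median", "n_itr", "n_gc", "total_time")]
```

With time-based stopping, a slower machine simply completes fewer iterations, so the iteration count alone would not explain a tenfold difference in the per-iteration timings.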

laptop with i5

library(landscapemetrics)
library(raster)
#> Loading required package: sp
bench::mark(landscapemetrics:::rcpp_get_unique_values(as.matrix(augusta_nlcd)), iterations = 1000)
#> # A tibble: 1 x 10
#>   expression    min   mean median    max `itr/sec` mem_alloc  n_gc n_itr
#>   <chr>      <bch:> <bch:> <bch:> <bch:>     <dbl> <bch:byt> <dbl> <int>
#> 1 landscape… 4.14ms 4.79ms 4.63ms 6.63ms      209.    4.72MB    36   964
#> # … with 1 more variable: total_time <bch:tm>

Created on 2019-04-24 by the reprex package (v0.2.1)

desktop with i7

library(landscapemetrics)
library(raster)
#> Loading required package: sp
bench::mark(landscapemetrics:::rcpp_get_unique_values(as.matrix(augusta_nlcd)), iterations = 1000)
#> # A tibble: 1 x 10
#>   expression    min   mean median    max `itr/sec` mem_alloc  n_gc n_itr
#>   <chr>      <bch:> <bch:> <bch:> <bch:>     <dbl> <bch:byt> <dbl> <int>
#> 1 landscape… 36.6ms 37.6ms 37.3ms 41.2ms      26.6    4.72MB    36   964
#> # … with 1 more variable: total_time <bch:tm>

Created on 2019-04-24 by the reprex package (v0.2.1)

gcc on both computers: gcc (GCC) 8.3.1 20190223 (Red Hat 8.3.1-2). Also, both computers have SSDs.

It is not a huge problem for me; however, it would be nice to find the reason for this difference...

But is it actually related to rcpp_get_unique_values?
Is

bench::mark(raster::unique(as.matrix(augusta_nlcd)), iterations = 1000)

also faster on the i5 laptop?
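Note that every benchmark in this thread also times `as.matrix(augusta_nlcd)`, because the conversion happens inside the benchmarked expression. One way to separate the two costs is to build the input once and time the function call on its own. The helper `time_steps()` below is hypothetical (it is not part of landscapemetrics or bench), and the example uses a synthetic matrix with base `unique()` so that it runs without the package data.

```r
library(bench)

# Hypothetical helper: times building the input and calling `fun` separately.
time_steps <- function(make_input, fun, iterations = 100) {
  input <- make_input()            # built once, outside the timed call
  bench::mark(
    conversion = make_input(),     # cost of producing the input
    call_only  = fun(input),       # cost of the function on a ready input
    iterations = iterations,
    check = FALSE                  # the two expressions return different things
  )
}

# Synthetic example; matrix size and values are arbitrary assumptions.
set.seed(42)
res <- time_steps(
  make_input = function() matrix(sample(1:15, 1e5, replace = TRUE), nrow = 250),
  fun        = function(m) unique(as.vector(m))
)
res[, c("expression", "min", "median", "n_itr")]
```

For the case discussed here, `make_input` would be `function() as.matrix(augusta_nlcd)` and `fun` would be `landscapemetrics:::rcpp_get_unique_values`, assuming the package versions shown in the session info above.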

laptop with i5

library(landscapemetrics)
library(raster)
#> Loading required package: sp
bench::mark(landscapemetrics:::rcpp_get_unique_values(as.matrix(augusta_nlcd)), iterations = 1000)
#> # A tibble: 1 x 10
#>   expression    min   mean median    max `itr/sec` mem_alloc  n_gc n_itr
#>   <chr>      <bch:> <bch:> <bch:> <bch:>     <dbl> <bch:byt> <dbl> <int>
#> 1 landscape… 4.16ms 4.95ms  4.8ms 8.32ms      202.    4.72MB    36   964
#> # … with 1 more variable: total_time <bch:tm>
bench::mark(raster::unique(as.matrix(augusta_nlcd)), iterations = 1000)
#> # A tibble: 1 x 10
#>   expression    min  mean median    max `itr/sec` mem_alloc  n_gc n_itr
#>   <chr>      <bch:> <bch> <bch:> <bch:>     <dbl> <bch:byt> <dbl> <int>
#> 1 raster::u… 3.97ms 5.1ms  4.7ms 10.6ms      196.    5.79MB   232   768
#> # … with 1 more variable: total_time <bch:tm>

Created on 2019-04-24 by the reprex package (v0.2.1)

desktop with i7

library(landscapemetrics)
library(raster)
#> Loading required package: sp
bench::mark(landscapemetrics:::rcpp_get_unique_values(as.matrix(augusta_nlcd)), iterations = 1000)
#> # A tibble: 1 x 10
#>   expression    min   mean median    max `itr/sec` mem_alloc  n_gc n_itr
#>   <chr>      <bch:> <bch:> <bch:> <bch:>     <dbl> <bch:byt> <dbl> <int>
#> 1 landscape… 36.9ms 37.5ms 37.4ms 39.9ms      26.7    4.72MB    36   964
#> # … with 1 more variable: total_time <bch:tm>
bench::mark(raster::unique(as.matrix(augusta_nlcd)), iterations = 1000)
#> # A tibble: 1 x 10
#>   expression    min   mean median    max `itr/sec` mem_alloc  n_gc n_itr
#>   <chr>      <bch:> <bch:> <bch:> <bch:>     <dbl> <bch:byt> <dbl> <int>
#> 1 raster::u… 2.92ms 3.02ms    3ms 4.04ms      331.    5.79MB   237   763
#> # … with 1 more variable: total_time <bch:tm>

Created on 2019-04-24 by the reprex package (v0.2.1)

I will close this for now. Please re-open if needed.