Slow performance with progressr::handler_cli
Opened this issue · 7 comments
I have noticed there is a considerable overhead when using handler_cli
. For example,
f <- function(R = 1000) {
pb <- progressr::progressor(steps = R)
onerun <- function(...) {
pb()
Sys.sleep(0.001)
}
system.time(lapply(seq_len(R), onerun))
}
progressr::handlers(progressr::handler_progress)
f() |> print()
## user system elapsed
## 1.594 0.023 2.879
compared to
progressr::handlers(progressr::handler_cli)
f() |> print()
## user system elapsed
## 9.058 0.021 10.352
Is this something that could be fixed, or perhaps this need to be taken up with the cli
maintainer?
> sessionInfo()
R version 4.3.2 (2023-10-31)
Platform: aarch64-apple-darwin22.6.0 (64-bit)
Running under: macOS Sonoma 14.3.1
Matrix products: default
BLAS: /Users/kkzh/.asdf/installs/R/4.3.2/lib/R/lib/libRblas.dylib
LAPACK: /Users/kkzh/.asdf/installs/R/4.3.2/lib/R/lib/libRlapack.dylib; LAPACK version 3.11.0
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
time zone: Europe/Copenhagen
tzcode source: internal
attached base packages:
[1] stats graphics grDevices utils datasets methods base
loaded via a namespace (and not attached):
[1] compiler_4.3.2 R6_2.5.1 cli_3.6.2 hms_1.1.3
[5] prettyunits_1.2.0 crayon_1.5.2 vctrs_0.6.5 progress_1.2.3
[9] progressr_0.14.0 digest_0.6.34 lifecycle_1.0.4 pkgconfig_2.0.3
[13] rlang_1.1.3
Very nice example code, and clear post! I'm not a developer, but thought I'd add a data point in case it's helpful. I do not observe a difference between the two.
progressr::handlers(progressr::handler_progress)
f() |> print()
user system elapsed
0.384 0.036 1.588
> progressr::handlers(progressr::handler_cli)
f() |> print()
user system elapsed
0.381 0.053 1.600
> sessionInfo()
R version 4.2.2 Patched (2022-11-10 r83330)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 23.04
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/atlas/libblas.so.3.10.3
LAPACK: /usr/lib/x86_64-linux-gnu/atlas/liblapack.so.3.10.3
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8
[4] LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C LC_ADDRESS=C
[10] LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] colorout_1.2-2
loaded via a namespace (and not attached):
[1] compiler_4.2.2 cli_3.6.1 parallel_4.2.2 progressr_0.14.0 git2r_0.32.0
[6] digest_0.6.33 rlang_1.1.1
>
I also tried a different system, which has very different hardware but similar OS (one is Ubuntu 23.04, the other 23.10), and similarly do not see a difference:
> sessionInfo()
R version 4.2.2 Patched (2022-11-10 r83330)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 23.04
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/atlas/libblas.so.3.10.3
LAPACK: /usr/lib/x86_64-linux-gnu/atlas/liblapack.so.3.10.3
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] colorout_1.2-2
loaded via a namespace (and not attached):
[1] compiler_4.2.2 cli_3.6.1 parallel_4.2.2 progressr_0.14.0 git2r_0.32.0 digest_0.6.33 rlang_1.1.1
>
Do you have another system you can test your code on?
I can reproduce the slowness of the cli handler (R 4.3.3 on Linux running R
from the terminal);
## Baseline
> progressr::handlers(progressr::handler_void())
> f()
user system elapsed
0.482 0.026 1.622
## 'progress' package
> progressr::handlers(progressr::handler_progress)
> f()
user system elapsed
0.879 0.024 1.994
## 'cli' package
> progressr::handlers(progressr::handler_cli)
> f()
user system elapsed
5.174 0.005 6.283
Running
profvis::profvis(f())
gives:
which reveals that lots of the overhead comes from time spent in cli:::clii__container_start()
and cli:::clii__container_end()
.
So, yes, it looks like a cli issue. That said, before reporting to them, I would come up with an example that does not involve progressr. Comparing the performance of a vanilla cli progress bar to one from the progress package would probably be a good case to report on.
@kkholst I wonder if it is a regression from cli package from 3.6.1 to 3.6.2. If you make a minimal example without future package and want me to test it, I can test it on 3.6.1 and hopefully fail to reproduce the problem, and then I can upgrade to 3.6.2 and test it there to reproduce it.
Also, it looks like the performance gets worse in the RStudio Console (12s vs 6s). I'm testing with:
> packageVersion("cli")
[1] '3.6.2'
Here is what I get when comparing directly against progress
and cli
progressr::handlers(global=TRUE)
f <- function(R = 500,
type = c("progressr", "progress", "cli")) {
cli::cli_alert_info(type)
type <- gsub("\\.(.*)", "", type)
pb <- switch(type[1],
progress = progress::progress_bar$new(total = R),
cli = cli::cli_progress_bar(total = R),
progressr = progressr::progressor(steps = R),
NULL
)
for (i in seq_len(R)) {
switch(type[1],
progress = pb$tick(),
cli = cli::cli_progress_update(),
progressr = pb(),
NULL
)
Sys.sleep(0.005)
}
}
f(type = "none") |>
system.time() |>
print()
progressr::handlers(progressr::handler_void)
f(type = "progressr.void") |>
system.time() |>
print()
progressr::handlers(progressr::handler_progress)
f(type = "progressr.progress") |>
system.time() |>
print()
progressr::handlers(progressr::handler_cli)
f(type = "progressr.cli") |>
system.time() |>
print()
f(type = "progress") |>
system.time() |>
print()
f(type = "cli") |>
system.time() |>
print()
ℹ none
user system elapsed
0.020 0.005 3.105
ℹ progressr.void
user system elapsed
0.499 0.012 3.600
ℹ progressr.progress
user system elapsed
1.473 0.022 4.639
ℹ progressr.cli
user system elapsed
5.046 0.012 8.240
ℹ progress
user system elapsed
0.606 0.015 3.718
ℹ cli
user system elapsed
0.450 0.005 3.541
Note, that progressr
with the progress
handler also seems to run 25% slower.
Thanks for this. Interesting, this suggests that the extra overhead comes from progressr interacting with cli.
Note that there will always be another layer of overhead when using progressr. All progress updates are signalled as condition in R, which handled by R's condition handlers. That layer in progressr is orchestrating these conditions, and translates them to commands specific to each progress handler package.
Looking at your stats, subtracting the "none" baseline, we have:
dt <- c(none=3.105, progressr.void=3.600, progressr.progress=4.639, progressr.cli=8.240, progress=3.718, cli=3.541)
dt2 <- dt - dt[["none"]]
print(dt2)
#> none progressr.void
#> 0.000 0.495
#> progressr.progress progressr.cli
#> 1.534 5.135
#> progress cli
#> 0.613 0.436
Then, the added overhead per progress update (in milliseconds) is:
R <- 500
ms <- 1000*dt2/R
print(ms)
#> none progressr.void
#> 0.000 0.990
#> progressr.progress progressr.cli
#> 3.068 10.270
#> progress cli
#> 1.226 0.872
The handlers("cli")
stands out by costing ~10ms per progress update, when the bare-bones baseline is ~1ms per progress update.
I'll add it to the list to see if there's some obvious that stands out and that can be improved.
As a workaround, see #81 for how to limit how frequently you send progress updates.
I'm sure you're already aware of this, but signalling lots of progress updates for short-running tasks is less of a value compared to long-running tasks. So, in practice, I expect 10ms vs 1ms per progress update makes little difference to the overall processing time in those cases. That said, if it can be improved, why not fix it. Thus far, I haven't profiled or optimized anything, so there's probably something that can be improved.
Another, much simpler, workaround is to use the intrusiveness
argument that each progress handler has. Using:
progressr::handlers(progressr::handler_cli(intrusiveness = 10.0))
will cause the handler to render progress 10 times less frequently. So, instead of reporting on all R = 500
progress update, it only reports on every 10:th, i.e. in total 50 of them. That will bring your benchmark results from:
ℹ none
user system elapsed
0.043 0.006 2.606
ℹ progressr.void
user system elapsed
0.325 0.017 2.910
ℹ progressr.progress
user system elapsed
1.027 0.028 3.605
ℹ progressr.cli
user system elapsed
2.743 0.018 5.309
ℹ progress
user system elapsed
0.414 0.010 2.974
ℹ cli
user system elapsed
0.264 0.011 2.829
down to:
ℹ progressr.cli
user system elapsed
1.659 0.020 4.241