import inside py_capture_output breaks delay_load
gdurif opened this issue · 6 comments
I am using reticulate::import()
with delay_load=TRUE
to load a Python package in the .onLoad()
function of my R package (so that user can setup their Python session, e.g. using a virtual environment, afterward).
Said Python package is printing to stdout when being imported, which I want to avoid because it is good practice that R package verbosity on loading could be disabled by the user (for quiet loading).
So I tried to wrap my reticulate::import()
call inside a reticulate::py_capture_output()
call, however the delay loading is not working anymore. Below is a minimum working example:
SciPy
is not available, direct import fails as expected:
library(reticulate)
scipy <- import("scipy")
## Error in py_module_import(module, convert = convert) :
## ModuleNotFoundError: No module named 'scipy'
## Run `reticulate::py_last_error()` for details.
SciPy
is not available yet but import with delay loading works as expected (e.g. I may callreticulate::use_virtualenv()
to setup a Python virtual environment whereSciPy
is installed afterward):
library(reticulate)
scipy <- import("scipy", delay_load=TRUE)
- Import with delay loading inside a
py_capture_output()
call does not work:
library(reticulate)
msg <- py_capture_output({scipy <- import("scipy", delay_load=TRUE)})
## Error in py_module_import(module, convert = convert) :
## ModuleNotFoundError: No module named 'scipy'
## Run `reticulate::py_last_error()` for details.
Note: this behavior is maybe expected since reticulate
may need to initiate the Python session to call py_capture_output()
. In that case, how could I avoid verbosity printing to stdout when importing a Python package during my R package loading ?
Thanks
Edit1: this is a minimum working example, SciPy
import produces no output that I would need to capture. My R package is importing another Python package that produces output at import, namely pykeops
.
Edit2: this is my R session info:
R version 4.3.2 (2023-10-31)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Arch Linux
Matrix products: default
BLAS: /usr/lib/libblas.so.3.12.0
LAPACK: /usr/lib/liblapack.so.3.12.0
locale:
[1] LC_CTYPE=en_IE.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_IE.UTF-8 LC_COLLATE=en_IE.UTF-8
[5] LC_MONETARY=en_IE.UTF-8 LC_MESSAGES=en_IE.UTF-8
[7] LC_PAPER=en_IE.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_IE.UTF-8 LC_IDENTIFICATION=C
time zone: Europe/Paris
tzcode source: system (glibc)
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] reticulate_1.35.0
loaded via a namespace (and not attached):
[1] compiler_4.3.2 Matrix_1.6-1.1 cli_3.6.2 Rcpp_1.0.12 grid_4.3.2
[6] jsonlite_1.8.8 rlang_1.1.3 png_0.1-8 lattice_0.21-9
What is the output you are trying to suppress? (import("scipy")
produces no output for me).
@t-kalinowski This is a minimum working example, I am not importing SciPy
in my package but another Python package. Sorry it was not clear, I edited my issue to specify it.
We don't currently have an official API for this. Thinking though how this could be implemented, an R package could probably do something like this (completely untested code):
.onLoad <- function(...) {
if (reticulate::py_available())
reticulate::py_capture_output({
module <<- reticulate::import("module")
})
else {
py_output_context_manager <- NULL
module <<- import("module", delay_load = list(
before_load = function() {
reticulate::py_available(TRUE) # initialize python
# get output tools helper functions
output_tools <- import("rpytools.output")
py_output_context_manager <<-
output_tools$OutputCaptureContext(capture_stdout = TRUE,
capture_stderr = TRUE)
py_output_context_manager$`__enter__`()
}
on_load = function() {
py_output_context_manager$`__exit__`()
py_output_context_manager <<- NULL
},
on_error = function(error) {
py_output_context_manager$`__exit__`()
py_colected_output <- py_output_context_manager$collect_output()
py_output_context_manager <<- NULL
error$message <- paste0(c(error$message, py_colected_output), sep = "\n")
stop(error)
}
))
}
}
However, taking a look at 'pykeops' specifically, seems like this is all that's needed to suppress output:
Sys.setenv("PYKEOPS_VERBOSE" = "0")
Thanks a lot @t-kalinowski! I'll work on that.
Regarding the environment variable setup, I want to catch the output from Python and print it with packageStartupMessage()
in R (so that it can be disabled or not using library(xxx, quietly = TRUE/FALSE)
), and not just totally disable it.
Worth mentioning if you're going to be setting env variables: if Python is initialized already, changes to environment variables made with Sys.setenv()
don't automatically propagate to the Python session. (#1339)
So you'll want to do something like this:
Sys.setenv("PYKEOPS_VERBOSE" = "0")
if(py_available())
import("os")$environ$update(list("PYKEOPS_VERBOSE" = "0"))
I might be mistaken, but I don't think library(xxx, quietly = TRUE/FALSE)
suppresses output from packageStartupMessage()
, users would need suppressPackageStartupMessages()
for that.
With import(delay_load = TRUE)
, Python will not initialize during your packages .onLoad()
call, meaning, you won't be able to capture output from python initializing and pass it packageStartupMessage()
before .onLoad()
had returned. I believe conventionally packageStartupMessage()
is only called from .onLoad()
. I don't know if that's just convention or if CRAN enforces it, but either way, I'd recommend a different approach.
I just pushed a small updated to OutputCaptureContext()
, making it a little more ergonomic to use. (#1569)
@t-kalinowski thank you very much again.
You are right regarding suppressPackageStartupMessages()
and I did not think about delay import that will not work with packageStartupMessage()
. I will set the environment variable as you suggested.