targets' use of warnings is redundant, prone to false positives and negatives, and risks (causing other packages to) violating CRAN policy in normal use
cjvanlissa opened this issue · 2 comments
I'm working on integrating targets
with my worcs
package, and notice the following gratuitous warning when _targets.R
contains the code:
tar_target(
name = df,
command = worcs::load_data(to_envir = FALSE)$embeddings
)
Output:
Warning message:
Do not use load_data() from {devtools} or {pkgload} to load packages or custom functions/globals for `targets`. If you do, custom functions will go to a package environment where `targets` may not track them, and the loaded data will not be available in parallel workers created by tar_make_clustermq() or tar_make_future(). Read https://books.ropensci.org/targets/packages.html#loading-and-configuring-r-packages and https://books.ropensci.org/targets/packages.html#package-based-invalidation for the correct way to load packages for `targets` pipelines. Warnings like this one are important, but if you must suppress them, you can do so with Sys.setenv(TAR_WARN = "false").
I believe that this warning is poorly implemented; going purely by function name when throwing this warning is prone to false positives, as you see here when a user calls a different function by the same name. Moreover, it is also prone to false negatives if the user calls the offending functions outside of _targets.R
. Moreover, this puts me in a position where I risk violating CRAN's policy when I engage in normal use of the targets package: "In principle, packages must pass R CMD check without warnings or significant notes to be admitted to the main CRAN package area. ".
I also do not believe that a warning is appropriate in this context, going by established canon about the use of warnings in packages; e.g.:
- "Warnings fall somewhat in between errors and message, and typically indicate that something has gone wrong but the function has been able to at least partially recover." (https://adv-r.hadley.nz/conditions.html) but nothing has gone wrong here.
- "avoid R’s warning feature. This is particularly important if you use R in production; when you regularly run R scripts as part of your business process. This is also important if you author R packages. Don’t issue warnings in your own code and treat warnings in 3rd party code as errors that are explicitly suppressed." (https://www.r-bloggers.com/2012/05/a-warning-about-warning/)
- "We have adopted a policy to write R code in such a way, that warnings will not occur in the production environment. Hence, if in production a warning arises, this is always unexpected and should be treated as an error. This is achieved by setting
options(warn = 2)
, which converts all warnings into errors (unexpected errors, to be precise)." (https://energizedanalytics.com/how-to-deal-with-errors-and-warnings-in-productive-r-code/)
Finally, the package proposes a way to suppress this warning, namely Sys.setenv(TAR_WARN = "false").
. But there are two problems with this: firstly, this environmental variable does not exist until it is set to "false"
, so I cannot check its value in my package. Secondly, this environmental variable is redundant with options(warn=-1)
, thus introducing further unnecessary complexity and confusion.
Given that this package is maintained by ropensci
, I hope you will give some thought around your policy for throwing unnecessary warnings in production R-code, and consider reimplementing this one (and potentially others).
Prework
- [ v ] I understand and agree to help guide.
- [ v ] I understand and agree to contributing guide.
- [ v ] New features take time and effort to create, and they take even more effort to maintain. So if the purpose of the feature is to resolve a struggle you are encountering personally, please consider first posting a "trouble" or "other" issue so we can discuss your use case and search for existing solutions first.
Sorry for the rude/to the point message by the way! If I were less rushed, I would have written a more diplomatic message. I think targets is phenomenal and am a fellow open science enthusiast, and would love to have a productive discussion about this topic, because I do think these details matter. Have a good weekend!
as you see here when a user calls a different function by the same name.
62ef565 uses getNamespace(package)$.__DEVTOOLS__
instead of static code analysis to look for pkgload::load_all()
. (Edit: needed to drop that warning entirely due to issues in reverse dependencies.)
I also do not believe that a warning is appropriate in this context,
targets
almost never throws warnings. There are a small number of situations where an error is too restrictive but a message is too weak. pkgload::load_all()
is one of them.
this environmental variable does not exist until it is set to "false", so I cannot check its value in my package.
You can always check identical(Sys.getenv("TAR_WARN"), "")
to see if TAR_WARN is set. Packages can use withr::local_envvar()
and withr::with_envvar()
to temporarily set environment variables inside function calls without permanently modifying the user's global state.
this environmental variable is redundant with options(warn=-1), thus introducing further unnecessary complexity and confusion.
options(warn=-1)
affects all warnings, whereas TAR_WARN
applies only to those set by targets
. This granularity is useful.