ropensci/targets

targets' use of warnings is redundant, prone to false positives and negatives, and risks (causing other packages to) violating CRAN policy in normal use

cjvanlissa opened this issue · 2 comments

I'm working on integrating targets with my worcs package, and notice the following gratuitous warning when _targets.R contains the code:

tar_target(
    name = df,
    command = worcs::load_data(to_envir = FALSE)$embeddings
  )

Output:

Warning message:
Do not use load_data() from {devtools} or {pkgload} to load packages or custom functions/globals for `targets`. If you do, custom functions will go to a package environment where `targets` may not track them, and the loaded data will not be available in parallel workers created by tar_make_clustermq() or tar_make_future(). Read https://books.ropensci.org/targets/packages.html#loading-and-configuring-r-packages and https://books.ropensci.org/targets/packages.html#package-based-invalidation for the correct way to load packages for `targets` pipelines. Warnings like this one are important, but if you must suppress them,  you can do so with Sys.setenv(TAR_WARN = "false").

I believe that this warning is poorly implemented; going purely by function name when throwing this warning is prone to false positives, as you see here when a user calls a different function by the same name. Moreover, it is also prone to false negatives if the user calls the offending functions outside of _targets.R. Moreover, this puts me in a position where I risk violating CRAN's policy when I engage in normal use of the targets package: "In principle, packages must pass R CMD check without warnings or significant notes to be admitted to the main CRAN package area. ".

I also do not believe that a warning is appropriate in this context, going by established canon about the use of warnings in packages; e.g.:

  • "Warnings fall somewhat in between errors and message, and typically indicate that something has gone wrong but the function has been able to at least partially recover." (https://adv-r.hadley.nz/conditions.html) but nothing has gone wrong here.
  • "avoid R’s warning feature. This is particularly important if you use R in production; when you regularly run R scripts as part of your business process. This is also important if you author R packages. Don’t issue warnings in your own code and treat warnings in 3rd party code as errors that are explicitly suppressed." (https://www.r-bloggers.com/2012/05/a-warning-about-warning/)
  • "We have adopted a policy to write R code in such a way, that warnings will not occur in the production environment. Hence, if in production a warning arises, this is always unexpected and should be treated as an error. This is achieved by setting options(warn = 2), which converts all warnings into errors (unexpected errors, to be precise)." (https://energizedanalytics.com/how-to-deal-with-errors-and-warnings-in-productive-r-code/)

Finally, the package proposes a way to suppress this warning, namely Sys.setenv(TAR_WARN = "false").. But there are two problems with this: firstly, this environmental variable does not exist until it is set to "false", so I cannot check its value in my package. Secondly, this environmental variable is redundant with options(warn=-1), thus introducing further unnecessary complexity and confusion.

Given that this package is maintained by ropensci, I hope you will give some thought around your policy for throwing unnecessary warnings in production R-code, and consider reimplementing this one (and potentially others).

Prework

  • [ v ] I understand and agree to help guide.
  • [ v ] I understand and agree to contributing guide.
  • [ v ] New features take time and effort to create, and they take even more effort to maintain. So if the purpose of the feature is to resolve a struggle you are encountering personally, please consider first posting a "trouble" or "other" issue so we can discuss your use case and search for existing solutions first.

Sorry for the rude/to the point message by the way! If I were less rushed, I would have written a more diplomatic message. I think targets is phenomenal and am a fellow open science enthusiast, and would love to have a productive discussion about this topic, because I do think these details matter. Have a good weekend!

as you see here when a user calls a different function by the same name.

62ef565 uses getNamespace(package)$.__DEVTOOLS__ instead of static code analysis to look for pkgload::load_all(). (Edit: needed to drop that warning entirely due to issues in reverse dependencies.)

I also do not believe that a warning is appropriate in this context,

targets almost never throws warnings. There are a small number of situations where an error is too restrictive but a message is too weak. pkgload::load_all() is one of them.

this environmental variable does not exist until it is set to "false", so I cannot check its value in my package.

You can always check identical(Sys.getenv("TAR_WARN"), "") to see if TAR_WARN is set. Packages can use withr::local_envvar() and withr::with_envvar() to temporarily set environment variables inside function calls without permanently modifying the user's global state.

this environmental variable is redundant with options(warn=-1), thus introducing further unnecessary complexity and confusion.

options(warn=-1) affects all warnings, whereas TAR_WARN applies only to those set by targets. This granularity is useful.