tidymodels/recipes

The `bake.step_log()` S3 method breaks legacy recipe objects

stufield opened this issue ยท 5 comments

The problem

The current (CRAN v1.0.9) versions of bake.step_log() breaks older (legacy) recipes made prior to (guessing) v1.0.4, > 1 year ago. When attempting to bake() a prepped recipe that has a log_step, the bake() S3 method used to check for the appropriate columns using the x$columns entry of the object, however it now looks for its names, i.e. names(x$columns), which for older recipes without vector names will be NULL, and breaks backward compatibility of the object.

The line where this happens is: https://github.com/tidymodels/recipes/blob/main/R/log.R#L127

And it appears to have been introduced here: 6155183

Perhaps even worse, the bake() step does not fail when names are NULL, it skips the logging-loop and exits without modifying the any columns, and resulting in incorrect pre-processed data.

Questions:

  1. are there any future plans to address backward compatibility of older objects?
  2. is there a recommended (preferred) workaround?
  3. should breaking changes of legacy recipes be a consideration in semantic versioning?
  4. should I modify the recipe object as necessary for it to function in v1.0.9?

The last option is not ideal since in my use case, these recipes (~45) are production level objects that are frozen with their corresponding model objects and are tied to regulatory controlled customer products.

Reprex

I do not have a reprex per se, but there is a simple workaround:

update_legacy_logstep <- function(obj) {
  idx <- which(vapply(obj$steps, inherits, what = "step_log", NA))
  for (i in seq_len(idx)) {
    step <- obj$steps[[i]]
    if (is.null(names(step$columns))) {
      names(obj$steps[[i]]$columns) <- step$columns
    }
  }
  obj
}

Hello!

This is indeed a bug. And should be fixed. It is kinda hard testing recipes steps for backwards compatibility so sometimes these things fall through the cracks. This is a high priority issue

Would you be able to freeze the installed {recipes} versions? Or were the recent update necessary? Your temporary fix would work as well

I'm out of office without computer, coming back on 15th.

Hi @EmilHvitfeldt ,

In our production environment we intentionally run a few versions behind "live CRAN" and do an annual update, which is when we caught this issue.

Waiting for a bug-fix likely won't work since we cannot use a development version (for reproducibility we must use a released CRAN version). So I'll use the workaround-fix adding the temporary names to the object prior to baking, which should allow backward compatibility internally with our code base too.

I definitely know how things can slip through the cracks ๐Ÿคท๐Ÿฝโ€โ™‚๏ธ
I more or less just wanted to know if this was on your radar and if there was a known fix I should be looking into.

Thanks! ๐Ÿ™Œ๐Ÿฝ

@stufield A new version is on CRAN with your change. Thanks!

Thank you @topepo !

This issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with a reprex https://reprex.tidyverse.org) and link to this issue.