tidyverse/purrr

FR: Default value for `map_xxx()` when return is of length 0

DanChaltiel opened this issue · 1 comment

Hi,

When you use map_xxx(), for instance map_chr(), the function .f should return a value of length exactly 1.

While I agree that a value of length >1 is obviously the sign of an error, I think a value of length 0 is more likely the sign of a missing value.

Therefore, maybe we could have a .default argument, giving the value that should be returned if .f returns a value of length 0.

I am guessing the default should be .default="error" for backward compatibility, although I doubt anyone relies on this error in their workflow, so .default=NA would also work.

I couldn't come up with a better example than the following, but please trust me that this happens rather frequently in my experience, for instance when dealing with lookups.
Here is a reprex:

library(tidyverse)
x = tibble(
  df=c("iris", "mtcars"), 
  data=list(iris, mtcars), 
  nm=map(data, names)
)
x %>% mutate(id=map(nm, ~.x[nchar(.x)==7]))
#> # A tibble: 2 x 4
#>   df     data           nm         id       
#>   <chr>  <list>         <list>     <list>   
#> 1 iris   <df [150 x 5]> <chr [5]>  <chr [1]>
#> 2 mtcars <df [32 x 11]> <chr [11]> <chr [0]>
x %>% mutate(id=map_chr(nm, ~.x[nchar(.x)==7]))
#> Error in `mutate()`:
#> i In argument: `id = map_chr(nm, ~.x[nchar(.x) == 7])`.
#> Caused by error in `map_chr()`:
#> i In index: 2.
#> Caused by error:
#> ! Result must be length 1, not 0.

map_chr2 = function(.x, .f, .default=NA_character_, ...){
  rtn = map(.x, .f, ...)
  rtn[lengths(rtn)==0] = .default
  list_c(rtn)
}
x %>% mutate(id=map_chr2(nm, ~.x[nchar(.x)==7], .default=NA))
#> # A tibble: 2 x 4
#>   df     data           nm         id     
#>   <chr>  <list>         <list>     <chr>  
#> 1 iris   <df [150 x 5]> <chr [5]>  Species
#> 2 mtcars <df [32 x 11]> <chr [11]> <NA>

Created on 2023-11-24 with reprex v2.0.2

This would be a really big departure from a pretty central rule in purrr, so unfortunately it's not something I'd feel happy incorporating into purrr. If you wanted to make this a bit simpler to use in your own code, I'd suggest creating an adverb:

library(tidyverse)
x = tibble(
  df=c("iris", "mtcars"), 
  data=list(iris, mtcars), 
  nm=map(data, names)
)

with_zero_default <- function(.f, .default = NA) {
  force(.f)
  force(.default)

  function(.x, ...) {
    out <- .f(.x, ...)
    if (length(out) == 0) out <- .default
    out
  }
}
x %>% mutate(id=map_chr(nm, with_zero_default(\(x) x[nchar(x)==7], NA)))
#> # A tibble: 2 × 4
#>   df     data           nm         id     
#>   <chr>  <list>         <list>     <chr>  
#> 1 iris   <df [150 × 5]> <chr [5]>  Species
#> 2 mtcars <df [32 × 11]> <chr [11]> <NA>

Created on 2024-07-15 with reprex v2.1.0
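
Since the adverb wraps `.f` rather than a particular `map_xxx()` variant, it should carry over to the other typed maps as long as `.default` has a matching type. The following is only a sketch under that assumption: `with_zero_default()` is repeated from the comment above so the block runs on its own, and the `which()` lookup and the `idx` name are made up for illustration.

```r
library(purrr)

# Adverb from the comment above, repeated so this block is self-contained.
with_zero_default <- function(.f, .default = NA) {
  force(.f)
  force(.default)

  function(.x, ...) {
    out <- .f(.x, ...)
    if (length(out) == 0) out <- .default
    out
  }
}

# Illustrative lookup: position of the column whose name has exactly 7 characters.
# `which()` returns integer(0) when nothing matches, which map_int() would reject.
idx <- map_int(
  list(names(iris), names(mtcars)),
  with_zero_default(function(nm) which(nchar(nm) == 7), NA_integer_)
)
idx
#> [1]  5 NA
```

Only length-0 results are replaced. A result of length greater than 1 is passed through unchanged, so `map_int()` would still raise its usual error in that case, which keeps the length-1 rule mentioned above intact.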