ropensci-books/drake

Explain how concatenation of drake subtargets works in the manual

kendonB opened this issue · 2 comments

We currently just have "To understand how drake splits and concatenates dynamic targets, see functions vec_size(), vec_slice(), and vec_c()."

My understanding was that the dynamic targets behave exactly as if they were concatenated using vec_c(). However, this is only true when using loadd(); it is not true when using map().

I came across a case where I wanted to further split a dynamic target (a list of length M) into a larger number of elements (a list of length M*N). When I use that dynamic target in a subsequent target in map(), I find that it behaves as if it were length M, whereas when I readd() it, it is length M*N.

Maybe this is a bug. Happy to open an issue in the other GitHub repository if so.

> My understanding was that the dynamic targets behave exactly as if they were concatenated using vec_c(). However, this is only true when using loadd(); it is not true when using map().

You also get vec_c() behavior when you use an entire dynamic target without any transform as in z below.

library(dplyr)
library(drake)
plan <- drake_plan(
  x = head(mtcars),
  y = target(
    x[, c("mpg", "cyl")],
    dynamic = map(x)
  ),
  z = colMeans(y) # y is dynamic, but we do not use a transform on it.
)

make(plan)
#> ▶ target x
#> ▶ dynamic y
#> > subtarget y_25909f33
#> > subtarget y_7fb9021c
#> > subtarget y_1ef01c48
#> > subtarget y_cc68c33a
#> > subtarget y_3005dd12
#> > subtarget y_a853fffb
#> ■ finalize y
#> ▶ target z

readd(z)
#>  mpg  cyl 
#> 20.5  6.0

colMeans(head(mtcars[, c("mpg", "cyl")]))
#>  mpg  cyl 
#> 20.5  6.0

Created on 2020-03-05 by the reprex package (v0.3.0)
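The agreement between the two results above can be reproduced with vctrs alone. The following is only a rough sketch of the idea, not drake's actual internals: it assumes that each sub-target of y receives one row of x via vec_slice() and that the whole target is then rebuilt with vec_c(). The names `slices` and `y` are illustrative.

```r
library(vctrs)

x <- head(mtcars)[, c("mpg", "cyl")]

# Take one-row slices, roughly what each sub-target of `y` would receive.
slices <- lapply(seq_len(vec_size(x)), function(i) vec_slice(x, i))

# Bind the slices back together, as happens when the whole dynamic
# target is used without a transform.
y <- do.call(vec_c, slices)

vec_size(y)   # number of rows after concatenation
colMeans(y)   # same column means as colMeans(x)
```

Because vec_c() row-binds data frame slices, a downstream target such as z sees something equivalent to the original rows of x.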

> I came across a case where I wanted to further split a dynamic target (a list of length M) into a larger number of elements (a list of length M*N). When I use that dynamic target in a subsequent target in map(), I find that it behaves as if it were length M, whereas when I readd() it, it is length M*N.

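This behavior is expected. map() takes M single-element slices, one per sub-target, but readd() uses vec_c() to bind the sub-targets together. If each sub-target is itself a list of length N, vec_c() flattens them into a single list of length M*N. It is not ideal, but it is the price we pay for fully embracing vctrs.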

str(vctrs::vec_c(list(1, 2), list(3, 4)))
#> List of 4
#>  $ : num 1
#>  $ : num 2
#>  $ : num 3
#>  $ : num 4


To counter it, wrap each sub-target in its own list. This should always be possible.

str(vctrs::vec_c(list(list(1, 2)), list(list(3, 4))))
#> List of 2
#>  $ :List of 2
#>   ..$ : num 1
#>   ..$ : num 2
#>  $ :List of 2
#>   ..$ : num 3
#>   ..$ : num 4


Note: readd(your_target, subtarget_list = TRUE) suppresses the vec_c() behavior and returns the sub-targets as a list instead.
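The length mismatch discussed in this thread, and the wrapping workaround, can also be checked with vctrs alone. This is a minimal sketch under stated assumptions: M, N, `flat`, `wrapped`, and `combined` are hypothetical names, and the lists stand in for sub-target return values rather than anything read from a drake cache.

```r
library(vctrs)

M <- 3  # hypothetical number of sub-targets
N <- 2  # hypothetical number of elements each sub-target returns

# Each sub-target returns a plain list of length N.
flat <- lapply(seq_len(M), function(i) as.list(seq_len(N) + 10 * i))

# Concatenating plain lists flattens them: the result has M * N elements.
vec_size(do.call(vec_c, flat))

# Workaround: each sub-target wraps its result in an outer list of length 1.
wrapped <- lapply(flat, list)
combined <- do.call(vec_c, wrapped)

# Now the concatenation has M elements, one per sub-target.
vec_size(combined)

# A single-element slice holds one sub-target's full list of N values.
str(vec_slice(combined, 1))
```

With the wrapping in place, the number of slices seen by a downstream map() and the length of the concatenated value should both be M.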