langcog/tidyboot

Performance issues with tidyr ~ 1.0.0

Closed · 1 comment

I'm afraid this is difficult to reproduce without profiling at different versions, but I wanted to give an anecdotal heads-up about performance issues I encountered after updating to the most recent version of R and updating all of my tidyverse packages on a new machine.

On my machine, the code block below takes about 100 ms for nboot = 10, 29,270 ms for nboot = 100, 107,750 ms for nboot = 200, and so on.

library(dplyr)     # provides %>% and group_by()
library(tidyboot)

data("diamonds", package = "ggplot2")
diamonds %>%
  group_by(color) %>%
  tidyboot_mean(price, nboot = 100)
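For anyone who wants to check the scaling on their own setup, here is a minimal timing sketch. It assumes dplyr, ggplot2, and tidyboot are installed; `time_boot` is a hypothetical helper name, and the small nboot values are chosen only to keep the run short, so the numbers will not match the ones quoted above.

```r
library(dplyr)
library(tidyboot)

data("diamonds", package = "ggplot2")

# Hypothetical helper: elapsed seconds for one bootstrap run of size n
time_boot <- function(n) {
  system.time(
    diamonds %>%
      group_by(color) %>%
      tidyboot_mean(price, nboot = n)
  )[["elapsed"]]
}

# Small values for illustration; raise them to see how the cost grows
nboots <- c(10, 20)
timings <- vapply(nboots, time_boot, numeric(1))
print(data.frame(nboot = nboots, seconds = timings))
```

If the elapsed time grows much faster than nboot, that would be consistent with the slowdown described here.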

I profiled the code, and essentially all of the time is spent in tidyr::unnest, specifically in unchop. I know unnest took a major performance hit in the tidyr 1.0.0 update, but even after updating to tidyr 1.1.0, which included a number of performance improvements to unnest, it's still pretty slow.

[Screenshot of the profiling results, 2020-05-28]

Do you think it would be possible to work around this in tidyboot? I wondered whether even switching back to unnest_legacy would help, if the fancier new interface for unnest isn't strictly necessary.
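A rough way to compare the two functions in isolation is sketched below. It assumes tidyr >= 1.0.0 (which exports both `unnest` and `unnest_legacy`) and tibble are installed. The `nested` data frame and its `strap` column are made-up stand-ins for bootstrap output, not what tidyboot builds internally, so treat the result as indicative only.

```r
library(tidyr)   # assumes tidyr >= 1.0.0, which exports unnest_legacy()
library(tibble)

# Made-up nested data: one small data frame per (group, replicate) cell
n_groups <- 7
nboot <- 100
nested <- tibble(
  color = rep(LETTERS[4:10], each = nboot),
  strap = lapply(
    seq_len(n_groups * nboot),
    function(i) tibble(price = rnorm(50))
  )
)

# Time the new interface against the legacy one on the same input
t_new <- system.time(res_new <- unnest(nested, cols = c(strap)))
t_legacy <- system.time(res_legacy <- unnest_legacy(nested, strap))

print(c(unnest = t_new[["elapsed"]], unnest_legacy = t_legacy[["elapsed"]]))
```

Both calls should produce the same number of rows, so any difference in elapsed time comes from the unnesting implementation and not from the data.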

Fixed by Robert in 082557d.