Progress bar misleading when ICE = TRUE

Question

notiv opened this issue 6 years ago · 6 comments

When using partial with ICE = FALSE and progress = 'time', the progress bar gives a reasonable estimate of the time to completion. With ICE = TRUE, however, the estimate is far off: the function keeps running for quite some time after the progress bar has reached 100%. The dataset is fairly large (more than 700 features and more than 300K records). Is there an explanation for this?

Answer 1 · 2018-11-28T15:59:17.000Z

@notiv I can't imagine why that would happen, but it may be an error with either the progress or plyr packages. Do you have a reproducible example I can run on my end?

Answer 2 · 2018-11-28T16:12:00.000Z

I checked the source code and I didn't find a reason why that would happen either. I also tried to create a reprex, but there was no problem with small datasets. Do you expect different run times depending on the value of ICE? I thought that one should first calculate the ICE and then get the average, i.e. running times should be comparable.

I'll check further and I'll also try to create a reproducible example with a larger dataset.

Answer 3 · 2018-11-28T17:27:39.000Z

@notiv Correct, theoretically, ICE curves should be faster since they are computed first. However, you'll notice they take longer in pdp because they get post-processed (e.g., converted from wide to long format; initially, each ICE curve is in a different row) to make them easier to plot.
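
To make the wide-to-long step concrete, here is a minimal base R sketch of that kind of reshaping. It is not the actual pdp source: the toy data, the variable names, and the output column names (x, yhat, yhat.id) are assumptions made for illustration. The long result has one row per (curve, grid point) pair, so its size grows as records times grid points; assuming a 50-point grid, 300K records would give 15 million rows. Work like this happens after the predictions are finished, which would be consistent with the function still running once the bar shows 100%.

```r
# Illustrative only: toy data and hypothetical names, not pdp internals.
set.seed(101)
n_obs <- 3             # number of records (one ICE curve each)
grid  <- c(0, 0.5, 1)  # grid values for the predictor of interest

# Wide format: one row per ICE curve, one column per grid point.
wide <- matrix(rnorm(n_obs * length(grid)),
               nrow = n_obs, ncol = length(grid))

# Long format: one row per (curve, grid point) pair.
# as.vector() unrolls the matrix column by column, i.e. all curves at
# grid[1], then all curves at grid[2], and so on.
long <- data.frame(
  x       = rep(grid, each = n_obs),
  yhat    = as.vector(wide),
  yhat.id = rep(seq_len(n_obs), times = length(grid))
)

# The partial dependence curve is the average of the ICE curves
# at each grid point, which is cheap to get from the wide format.
pd <- colMeans(wide)
```

Averaging straight from the wide matrix skips the reshape entirely, which is one plausible reason the ICE = FALSE case finishes soon after the bar completes.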

Answer 4 · 2021-07-18T21:04:08.000Z

I'm having the same issue as @notiv. The progress bar is accurate for the PDP but not for the ICE plot. The dataset I'm dealing with is big (6 million data points), but plotting the PDP only takes a few minutes, whereas ICE takes forever, even after the progress bar reaches 100%. I assume it's the wide-to-long format conversion that's taking so long.

Answer 5 · 2021-07-18T23:23:55.000Z

I can try reimplementing the wide-to-long conversion, or even try to eliminate it whenever ice=TRUE. My task is removing the plyr dependency, so I’ll take a hard look at this soon.

Answer 6 · 2021-08-04T14:35:17.000Z

Fix available on this branch if anyone cares to test: https://github.com/bgreenwell/pdp/tree/foreach/R.