nanxstats/r-base-shortcuts

R-Insight: expand.grid Obsolete

Closed this issue · 1 comments

The expand.grid function in base R is largely obsolete. A better alternative is the vec_expand_grid function from the vctrs package. It improves on expand.grid in several ways:

  • Faster execution
  • Produces sorted output by default (the first column varies slowest, rather than fastest)
  • Never converts strings to factors
  • Does not add additional attributes
  • Drops NULL inputs
  • Can expand any vector type, including data frames and records
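
The differences listed above can be checked side by side. The following is a minimal sketch, assuming vctrs is installed; the input vectors x and y are made up for illustration and are not part of the original post.

```r
library(vctrs)

# Illustrative inputs (hypothetical, chosen so the row order is easy to read)
x <- c("b", "a")
y <- 1:2

base_grid  <- expand.grid(x = x, y = y)
vctrs_grid <- vec_expand_grid(x = x, y = y)

# expand.grid() converts strings to factors by default; vec_expand_grid() does not.
base_is_factor <- is.factor(base_grid$x)
vctrs_is_char  <- is.character(vctrs_grid$x)

# expand.grid() varies the first column fastest; vec_expand_grid() varies it slowest.
base_y  <- base_grid$y
vctrs_y <- vctrs_grid$y

# expand.grid() attaches an "out.attrs" attribute; vec_expand_grid() adds none.
base_has_attrs  <- !is.null(attr(base_grid, "out.attrs"))
vctrs_has_attrs <- !is.null(attr(vctrs_grid, "out.attrs"))

# NULL inputs are dropped instead of producing a column.
names_without_null <- names(vec_expand_grid(x = x, y = NULL))

# .vary = "fastest" reproduces the expand.grid() row order.
fastest_y <- vec_expand_grid(x = x, y = y, .vary = "fastest")$y
```

If the base R row order is needed, .vary = "fastest" is the documented switch for it, so moving between the two functions does not force a change in downstream code that relies on row position.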

A more involved example builds a fully crossed dataset from three dimensions: job positions, code provisions, and position categories. The simulated job titles are generated with the charlatan package:

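# charlatan supplies the simulated job titles; vctrs supplies vec_expand_grid()
library(charlatan)
library(vctrs)
set.seed(32491)  # fix the seed so the sampled codes are reproducible
# Ten simulated job titles
jb = ch_job(n = 10)
# Three codes of the form "<100-300>.<1-8>"
cd = paste0(sample(100:300, size = 3, replace = TRUE), ".", sample(1:8, size = 3, replace = TRUE))
# Three position categories
ct = c("TRNG", "ONBRD", "HR")
# Cross all three vectors: 10 x 3 x 3 = 90 rows
ds = vec_expand_grid(job = jb, code = cd, cat = ct)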
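
To inspect the shape of such a grid without installing charlatan, a smaller version can be sketched with hand-written values. The job titles and codes below are hypothetical stand-ins, not output from ch_job.

```r
library(vctrs)

# Hypothetical stand-ins for the simulated job titles and codes
jobs  <- c("Analyst", "Engineer", "Recruiter")
codes <- c("101.1", "202.5")
cats  <- c("TRNG", "ONBRD", "HR")

grid <- vec_expand_grid(job = jobs, code = codes, cat = cats)

# One row per combination of the three inputs: 3 * 2 * 3
n_rows <- nrow(grid)

# The argument names become the column names
col_names <- names(grid)

# The first column varies slowest, so the first rows all share the first job
first_jobs <- head(grid$job, 6)
```

The row count is always the product of the input lengths, which makes it a quick sanity check on a larger grid.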

NOTE: When using the vec_expand_grid function, every input must be supplied as a name-value pair (for example x = and y =, or descriptive field names); the names become the column names of the result. If an input is unnamed, the function throws an error.
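
The naming requirement can be seen with a short sketch. It assumes vctrs is installed and relies on the documented rule that all inputs must be named; the exact error message is not shown because it may vary between versions.

```r
library(vctrs)

# Named inputs work and give the columns their names
named <- vec_expand_grid(x = 1:2, y = c("a", "b"))

# Unnamed inputs are expected to raise an error; capture it instead of stopping
unnamed <- tryCatch(
  vec_expand_grid(1:2, c("a", "b")),
  error = function(e) e
)
is_error <- inherits(unnamed, "error")
```

Wrapping the call in tryCatch keeps a script running while still making the failure visible.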

Thanks for the suggestion - I won't include this for the same reason mentioned in #3.

Also, while the third-party solutions are surely very nice, saying expand.grid() is "largely obsolete" might not be entirely justified. We should, in general, avoid comparing solutions because that is not the intention here.