trinker/wakefield

group() limited to 2 groups?

ds4ci opened this issue · 1 comments

ds4ci commented

Thanks for wakefield!

I need to generate factors with more than two levels. group() accepts x with length > 2, but only samples the first two. Here is a toy example:

> tg <- group(n=100, x=c("a", "b", "c"), name = "test")
> summary(tg)
 a  b  c 
43 57  0 
> tg <- group(n=100, x=c("a", "b", "c"), prob = c(0.1, 0.2, 0.7), name = "test")
Error in sample.int(n = 2, size = n, replace = TRUE, prob = prob) :
  incorrect number of probabilities
> tg <- group(n=100, x=c("a", "b", "c"), prob = c(0.1, 0.2), name = "test")
> summary(tg)
 a  b  c 
32 68  0 

HTH,
Jim

If you look under the hood of that function you see:

group <- hijack(r_sample_binary_factor,
    name = "Group",
    x = c("Control", "Treatment")
)

This just uses the function r_sample_binary_factor with some defaults for control/treatment, which means it's not possible to use group to make more than 2 groups. I can certainly see why you might expect group to allow n groups, but that was not my intent. I will keep the functionality as is because 2-group sampling is more common in my experience when we create groups, and extending it to n groups would make the function slower. Most of the little variable-generating functions are actually just hijacking a function prefixed with r_. So if you can't find the variable function you're looking for, look at the r_ prefixed functions, in this case r_sample_factor:

tg <- r_sample_factor(n=100, x=c("a", "b", "c"), prob = c(0.1, 0.2, 0.7), name = "test")

## > tg <- r_sample_factor(n=100, x=c("a", "b", "c"), prob = c(0.1, 0.2, 0.7), name = "test")
## > summary(tg)
##  a  b  c 
##  8 18 74 
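
For anyone who wants to see the idea without wakefield, here is a minimal base R sketch of sampling a factor with any number of levels. The helper name sample_groups is hypothetical and is not part of wakefield; it is only meant to illustrate that prob needs one entry per level, which is why three probabilities fail against a sampler fixed at two levels. It is not how r_sample_factor is implemented.

```r
# Hypothetical helper (NOT part of wakefield): sample a factor with any
# number of levels using base R only.
sample_groups <- function(n, x = c("Control", "Treatment"), prob = NULL) {
    # sample() needs one probability per element of x
    stopifnot(is.null(prob) || length(prob) == length(x))
    # keep every level of x, even one that happens not to be drawn
    factor(sample(x, size = n, replace = TRUE, prob = prob), levels = x)
}

set.seed(1)  # arbitrary seed, only so the draw is reproducible
tg <- sample_groups(100, x = c("a", "b", "c"), prob = c(0.1, 0.2, 0.7))
table(tg)
```

The counts will differ from the ones above because the seed and the sampling code are not the same.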

I'm going to close this for now. Feel free to reopen if this does not address your concerns. I'll update the documentation to be clearer.