Incorrect group names in output of frq() using grouped data
andersson10 opened this issue · 1 comments
When I use frq() with a grouped tibble, with grouping done with dplyr::group_by and the tibble created with tibble::tibble, the printed group names are not always associated with the correct group. The output is not the same as when I use base::data.frame, for instance.
Consider the following:
library(tibble)
library(dplyr)
library(sjmisc)
df_1 <- tibble(
  x = rep(c("b", "a"), each = 2),
  y = rep(1:2, each = 2)
)
frq(group_by(df_1, x))
The output from frq(group_by(df_1, x)) above is:
> frq(group_by(df_1, x))
Grouped by:
x: b
# y <integer>
# total N=2 valid N=2 mean=2.00 sd=0.00
val frq raw.prc valid.prc cum.prc
2 2 100 100 100
<NA> 0 0 NA NA
Grouped by:
x: a
# y <integer>
# total N=2 valid N=2 mean=1.00 sd=0.00
val frq raw.prc valid.prc cum.prc
1 2 100 100 100
<NA> 0 0 NA NA
>
This output seems to be incorrect in that the values of the grouping variable have switched places. For instance, the mean of y for group x: b should be 1, not 2, as displayed.
> mean(df_1$y[df_1$x == "b"])
[1] 1
>
Where it says Grouped by: x: b it seems it should say Grouped by: x: a.
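The expected per-group means can be checked independently of frq(). The following is a minimal sketch using base R only; it rebuilds the example data with data.frame() and stringsAsFactors = FALSE (an assumption made here so that the snippet runs without tibble, while keeping x a character vector as in the tibble).

```r
# Rebuild the example data with base R only, keeping x as a character vector
df_1 <- data.frame(
  x = rep(c("b", "a"), each = 2),
  y = rep(1:2, each = 2),
  stringsAsFactors = FALSE
)

# Mean of y within each value of x, computed without frq()
group_means <- tapply(df_1$y, df_1$x, mean)
print(group_means)
```

Here the entry named b is 1 and the entry named a is 2, which is the pairing the frq() output should show.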
We can compare this to a similar operation, in which tibble::tibble has been replaced by base::data.frame:
df_2 <- data.frame(
  x = rep(c("b", "a"), each = 2),
  y = rep(1:2, each = 2)
)
df_2
frq(group_by(df_2, x))
Here, frq(group_by(df_2, x)) generates the following output:
> frq(group_by(df_2, x))
Grouped by:
x: a
# y <integer>
# total N=2 valid N=2 mean=2.00 sd=0.00
val frq raw.prc valid.prc cum.prc
2 2 100 100 100
<NA> 0 0 NA NA
Grouped by:
x: b
# y <integer>
# total N=2 valid N=2 mean=1.00 sd=0.00
val frq raw.prc valid.prc cum.prc
1 2 100 100 100
<NA> 0 0 NA NA
>
As we can see, the two groups a and b do not display the same values in both outputs.
Version info:
R version 3.5.2 (2018-12-20)
tibble 2.0.1
dplyr 0.8.0.1
sjmisc 2.7.7
Thanks! This is due to reordering when the grouping column (x in this case) is a character vector, and not a factor. If you use data.frame() with stringsAsFactors = FALSE, you get the same error. I fixed this, and will commit later.
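To illustrate the kind of reordering described in this reply, here is a hedged base R sketch. It is not the sjmisc source code. It assumes, purely for illustration, that group labels are taken in order of first appearance while the grouped data come back sorted by group value. The variable names are hypothetical.

```r
x <- rep(c("b", "a"), each = 2)  # character grouping column, "b" appears first
y <- rep(1:2, each = 2)

# Labels in order of first appearance: "b", then "a"
labels_by_appearance <- unique(x)

# split() treats x as a factor, so the groups come back sorted: "a", then "b"
groups_sorted <- split(y, x)

# Pairing labels and groups by position attaches the wrong label to each group
mismatched <- setNames(sapply(groups_sorted, mean), labels_by_appearance)
print(mismatched)

# Using the names carried by the sorted groups keeps labels and data aligned
matched <- sapply(groups_sorted, mean)
print(matched)
```

In this sketch, mismatched reports a mean of 2 under the label b, the same wrong pairing as in the tibble output above, while matched reports 1 for b. When x is a factor, the group order follows the factor levels, which is consistent with the base::data.frame example (where x was a factor by default in R 3.5.2) printing the correct labels.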