strengejacke/sjmisc

Bugs in frq related to sj-labelled character factors, tagged NAs, and minus select helpers

iago-pssjd opened this issue · 5 comments

I don't have much time right now to dive into the depths of sjmisc::frq again, so I am writing down here two issues I found lately and won't be able to solve soon.

  • One is that frq applied to a character factor with sj-labels does not print the labels in the correct order under generic conditions.
  • The second is that frq produces an error when applied to numeric data with sj-labelled tagged_na's.

Thank you!

Another thing to look at is the get_dot_data helper functions, since frq does not work with negative (i.e. minus) select helpers for removing variables, such as frq(., -starts_with("s")).

Do you have a reproducible example?

Actually, I can no longer work out how I ran into these issues. For the moment, I just quickly looked for an example of the first point:

  • One is that frq applied to a character factor with sj-labels does not print the labels in the correct order under generic conditions.
xf <- factor(x, levels = c("Male", "Man" , "Lady",   "Female"), labels = c("Male", "Male", "Female", "Female"))
xf
[1] Male   Male   Male   Female Female

str(xf)
 Factor w/ 2 levels "Male","Female": 1 1 1 2 2

library(sjlabelled)
library(sjmisc)  # provides frq()

attributes(set_labels(xf, labels = c("a", "b")))
$levels
[1] "Male"   "Female"

$class
[1] "factor"

$labels
       a        b 
"Female"   "Male" 

frq(set_labels(xf, labels = c("a", "b")))

x <categorical>
# total N=5  valid N=5  mean=1.40  sd=0.55

Value | Label | N | Raw % | Valid % | Cum. %
--------------------------------------------
    1 |     a | 3 |    60 |      60 |     60
    2 |     b | 2 |    40 |      40 |    100
 <NA> |  <NA> | 0 |     0 |    <NA> |   <NA>

So, for example, value 1 corresponds to level Male and therefore, after the sjlabelled::set_labels assignment, to sj-label b, but frq prints this label in the row for value 2.
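To make the expected pairing explicit, here is a minimal base R sketch that looks up, for each factor level in order, the label whose value equals that level. The vectors are copied from the attributes() output above; the names lvls, labs and aligned are only illustrative, and this is not a claim about how frq builds its table internally.

```r
# Level order of the factor and the labels attribute, as printed above
lvls <- c("Male", "Female")
labs <- c(a = "Female", b = "Male")

# For each level (in level order), find the label whose value is that level
aligned <- names(labs)[match(lvls, labs)]
aligned
# [1] "b" "a"
```

Under this lookup, value 1 (Male) would get label b and value 2 (Female) label a, which is the reverse of the order printed by frq.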

Now I cannot reproduce the second point.

  • The second is that frq produces an error when applied to numeric data with sj-labelled tagged_na's.

I remember getting it with data read from Stata or similar, but now, with artificially built examples, I don't get the issue, so you may skip it.
At least, I don't get it with numeric data, but I can reproduce an error with integer/factor data (I am not sure whether this was the error that motivated the issue, but probably):

library(haven)
library(sjmisc)  # provides frq()
x <- labelled(
     as.integer(c(1:3, tagged_na("a", "c", "z"), 4:1)),
     c("Agreement" = 1, "Disagreement" = 4, "First" = tagged_na("c"),
       "Refused" = tagged_na("a"), "Not home" = tagged_na("z"))
   )
frq(x)
Error in if (is.na(mydat$val[valid.vals]) & mydat$val[valid.vals + 1] ==  : 
  missing value where TRUE/FALSE needed
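The message itself is what base R raises whenever the condition of an if() evaluates to NA. A minimal sketch of that mechanism, using stand-in data rather than the real mydat from frq: the right-hand side of the comparison is truncated in the error above, so placeholder is a hypothetical value, and val, i, cond, errored and guarded are illustrative names only.

```r
# Stand-in values: two valid codes followed by a missing value
val <- c(1, 2, NA)
i <- 3            # position of the NA; val[i + 1] is past the end, hence NA
placeholder <- 0  # hypothetical right-hand side (truncated in the message above)

# TRUE & NA evaluates to NA, and if (NA) raises the error
cond <- is.na(val[i]) & val[i + 1] == placeholder
errored <- inherits(
  tryCatch(if (cond) "match" else "no match", error = function(e) e),
  "error"
)

# isTRUE() treats NA as FALSE, so the guarded branch runs without error
guarded <- if (isTRUE(cond)) "match" else "no match"
```

Whether a guard like isTRUE() is the right fix inside frq depends on what that branch is meant to do with missing values, so this only illustrates where the error comes from.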