Translated tables convert numeric to character if codebook is missing
Closed this issue · 1 comments
deepayan commented
Example:
> phonto::nhanesQuery("select SEQN, BPXSAR, BPXDAR from Raw.BPX_C order by SEQN") |> str()
'data.frame': 9643 obs. of 3 variables:
$ SEQN : int 21005 21006 21007 21008 21009 21010 21011 21012 21013 21014 ...
$ BPXSAR: num NA 98 96 104 118 136 NA 121 108 NA ...
$ BPXDAR: num NA 50 62 74 85 83 NA 65 67 NA ...
> phonto::nhanesQuery("select SEQN, BPXSAR, BPXDAR from Translated.BPX_C order by SEQN") |> str()
'data.frame': 9643 obs. of 3 variables:
$ SEQN : int 21005 21006 21007 21008 21009 21010 21011 21012 21013 21014 ...
$ BPXSAR: chr NA "98" "96" "104" ...
$ BPXDAR: chr NA "50" "62" "74" ...
These two variables are missing from the codebook. We normally have no way of knowing whether such a variable is numeric (in this case, we can check other cycles), but it's probably better to keep such variables numeric by default.
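As a rough illustration of "keep numeric by default", here is a minimal sketch of a post-processing step. It is not how the translation is actually implemented. `restore_numeric` is a hypothetical helper, and the small data frame stands in for the result of the Translated query above. The rule it applies (convert a character column back to numeric only when every non-missing value parses as a number) is an assumption about what a sensible default might look like.

```r
# Hypothetical helper (not part of phonto or nhanesA): convert character
# columns back to numeric when all non-missing values parse as numbers.
restore_numeric <- function(df) {
  for (nm in names(df)) {
    x <- df[[nm]]
    if (!is.character(x)) next            # leave non-character columns alone
    non_missing <- x[!is.na(x)]
    if (length(non_missing) == 0) next    # all-NA column: nothing to infer from
    parsed <- suppressWarnings(as.numeric(non_missing))
    if (!anyNA(parsed)) df[[nm]] <- as.numeric(x)
  }
  df
}

# Stand-in for the Translated.BPX_C result shown above (first few rows only).
translated <- data.frame(
  SEQN   = 21005:21008,
  BPXSAR = c(NA, "98", "96", "104"),
  BPXDAR = c(NA, "50", "62", "74"),
  stringsAsFactors = FALSE
)

str(restore_numeric(translated))
```

With this rule, a column that contains any non-numeric label stays character, so variables that really are categorical would not be affected.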
Other examples (not exhaustive) are HPVSWR_F, OHXPRL_B, OHXPRU_B.
This came up as one source of mismatch in the R vs DB translations.
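For spotting this kind of mismatch, something along these lines might help. It is only a sketch: `type_mismatches` is a hypothetical helper, and the two small data frames are stand-ins for the Raw and Translated query results shown earlier.

```r
# Hypothetical helper: list columns whose class differs between two
# data frames that share column names.
type_mismatches <- function(a, b) {
  common <- intersect(names(a), names(b))
  cls_a <- vapply(a[common], function(col) class(col)[1], character(1))
  cls_b <- vapply(b[common], function(col) class(col)[1], character(1))
  differs <- cls_a != cls_b
  data.frame(
    variable = common[differs],
    a = unname(cls_a[differs]),
    b = unname(cls_b[differs]),
    stringsAsFactors = FALSE
  )
}

# Stand-ins for the Raw and Translated results (first few rows only).
raw <- data.frame(
  SEQN   = 21005:21008,
  BPXSAR = c(NA, 98, 96, 104),
  BPXDAR = c(NA, 50, 62, 74)
)
translated <- data.frame(
  SEQN   = 21005:21008,
  BPXSAR = c(NA, "98", "96", "104"),
  BPXDAR = c(NA, "50", "62", "74"),
  stringsAsFactors = FALSE
)

print(type_mismatches(raw, translated))
```

For the stand-in data, only BPXSAR and BPXDAR are reported, since SEQN is integer in both.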
nathan-palmer commented
Changing translation process to leverage NHANESA.