normalize_cansim_values picks wrong language in tables without VALEUR column
Hello,
I noticed an issue with the way normalize_cansim_values works with French tables. The function determines the language of the table by looking for a column called VALEUR:
ifelse("VALEUR" %in% names(data), "fr", "en")
However, not all French tables have this column. For example, table 98-10-0002-01:
get_cansim("98-10-0002-01", "french")
The code then breaks when attempting to normalize the date column, because it subsets ref_date from the data, which returns NULL and leads to an unhelpful error:
Error in `vec_init()`:
! `x` must be a vector, not `NULL`.
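The mechanism can be illustrated without downloading anything. This is only a minimal sketch: the toy data frame, the `detect_language` helper, and the date column lookup are assumptions made for illustration, not the package's actual code.

```r
# Toy stand-in for a French table that lacks a VALEUR column (illustrative data).
data <- data.frame(
  "PÉRIODE DE RÉFÉRENCE" = c("2021", "2021"),
  "GÉO" = c("Canada", "Québec"),
  check.names = FALSE
)

# Hypothetical helper wrapping the heuristic quoted above.
detect_language <- function(data) {
  ifelse("VALEUR" %in% names(data), "fr", "en")
}

language <- detect_language(data)  # misdetected as English

# With the wrong language, the English date column is looked up and is missing.
date_column <- if (language == "fr") "PÉRIODE DE RÉFÉRENCE" else "REF_DATE"
ref_date <- data[[date_column]]    # NULL, since there is no REF_DATE column
is.null(ref_date)
```

Any later step that expects a vector here receives NULL instead, which would be consistent with the error above.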
I would suggest passing the language argument from get_cansim into normalize_cansim_values instead of deriving it. Alternatively, the check could also look for other French column names:
- "PÉRIODE DE RÉFÉRENCE"
- "GÉO"
- "UNITÉ DE MESURE"
- "IDENTIFICATEUR D'UNITÉ DE MESURE"
- "FACTEUR SCALAIRE"
- "IDENTIFICATEUR SCALAIRE"
- "VECTEUR"
- "COORDONNÉES"
- "TERMINÉ"
- "DÉCIMALES"
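Both suggestions could be combined roughly as follows. This is a sketch under assumptions: `resolve_language` and `french_markers` are hypothetical names, and the way the explicit language string is interpreted is a guess rather than how the package handles it.

```r
# Column names taken from the list above, plus VALEUR; any of them suggests a French table.
french_markers <- c(
  "VALEUR", "PÉRIODE DE RÉFÉRENCE", "GÉO", "UNITÉ DE MESURE",
  "IDENTIFICATEUR D'UNITÉ DE MESURE", "FACTEUR SCALAIRE",
  "IDENTIFICATEUR SCALAIRE", "VECTEUR", "COORDONNÉES", "TERMINÉ", "DÉCIMALES"
)

# Hypothetical helper: prefer an explicit language, fall back to the column heuristic.
resolve_language <- function(data, language = NULL) {
  if (!is.null(language)) {
    # Assumption: inputs such as "fr" or "french" are recognized by their first two letters.
    return(if (tolower(substr(language, 1, 2)) == "fr") "fr" else "en")
  }
  if (any(french_markers %in% names(data))) "fr" else "en"
}

french_table <- data.frame("GÉO" = "Canada", check.names = FALSE)
english_table <- data.frame(GEO = "Canada", VALUE = 1)

resolve_language(french_table)             # "fr"
resolve_language(english_table)            # "en"
resolve_language(english_table, "french")  # "fr", the explicit argument wins
```

Passing the language explicitly avoids guessing altogether; the column check is only a fallback for data whose language is not known.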
Thanks for flagging this. Better language handling, either passing the language directly or adding it as an attribute to the data frame, is overdue. Will prioritize this for the next version.
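The attribute option might look something like the sketch below. The helper names and the attribute name "language" are assumptions made for illustration, not what the package ended up using.

```r
# Hypothetical helpers: tag a table with its language and read the tag back later.
set_table_language <- function(data, language) {
  attr(data, "language") <- language
  data
}

get_table_language <- function(data, default = "en") {
  language <- attr(data, "language", exact = TRUE)
  if (is.null(language)) default else language
}

french_table <- set_table_language(
  data.frame("GÉO" = "Canada", check.names = FALSE),
  "fr"
)

get_table_language(french_table)                # "fr"
get_table_language(data.frame(GEO = "Canada"))  # "en", falls back to the default
```

One caveat with this approach is that custom attributes may be dropped by some data frame operations, so the tag might need to be re-applied or backed by a fallback.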
This case looks like one more in a long list of issues we had to deal with when the census division pushed out their data via the NDM but deviated from the established NDM formats.
I made some changes that should solve this issue. Can you check if everything works as expected after installing the current development version?
remotes::install_github("mountainmath/cansim@v0.3.17")
There are still some issues with reading these tables into SQLite databases, but that's independent of English/French and has to do with the StatCan formatting for Census data differing from the usual NDM. Will try to sort this out at some later time.
Yes, after the update, I was able to access the table successfully.