walkerke/tidycensus

Hispanic variables switched in 2010 Census

walinchus opened this issue · 5 comments

I’m not sure why, but this package has P004002 labeled “Total!!Hispanic or Latino” and P004003 labeled “Total!!Not Hispanic or Latino”, when it should be the other way around according to the variable list on the Census API website.
image

2020 is labeled correctly, or at least it jibes with their website.

I'm not seeing what you see when I run load_variables(2010, "sf1"):

image

What specific code did you run, and what is your version of tidycensus?

I am noticing that the variables are swapped in the 2000 SF1. Any chance you accidentally ran load_variables(2000, "sf1")?

1.4.5 (Tidycensus version)

Hmm, no. It looks like I only used the load_variables() function with 2020 or 2010:
image

Specifically, I ran this code:
decennial_2010_vars <- load_variables( year = 2010, "pl", cache = TRUE)

I removed it from my environment and ran it again and still seem to be getting the same error:

image

Weird that I keep getting this. Thanks for your help.

Noting this code you used:

decennial_2010_vars <- load_variables( year = 2010, dataset = "pl", cache = TRUE)

That means you are pulling from the PL 94-171 redistricting file, not Summary File 1 as you wanted. You'd want instead:

decennial_2010_vars <- load_variables( year = 2010, dataset = "sf1", cache = TRUE)
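
One way to see the mismatch directly is to line up the two variable tables by code. The following is a minimal base R sketch, assuming the tables returned by load_variables() carry `name` and `label` columns. `compare_labels()` is a hypothetical helper written for illustration, not part of tidycensus.

```r
# Hypothetical helper (not part of tidycensus): join two variable tables
# on their variable code and keep only the codes whose labels differ.
compare_labels <- function(pl_vars, sf1_vars) {
  both <- merge(
    pl_vars[, c("name", "label")],
    sf1_vars[, c("name", "label")],
    by = "name",
    suffixes = c("_pl", "_sf1")
  )
  both[both$label_pl != both$label_sf1, , drop = FALSE]
}

# Live lookup: needs network access, so it only runs in an interactive session.
if (interactive()) {
  pl_vars  <- tidycensus::load_variables(2010, "pl")
  sf1_vars <- tidycensus::load_variables(2010, "sf1")
  print(compare_labels(pl_vars, sf1_vars))
}
```

Any code that shows up in the result means something different in the PL file than in SF1, so it should not be reused across the two datasets.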

And yes, it is annoying that Census does not keep variable names consistent across the different datasets. To mitigate this, pass the sumfile argument to get_decennial() (e.g. sumfile = "sf1"), and take your variable codes from a call to load_variables() for that same dataset and year. As a heads up, the same inconsistency exists between the PL and DHC files for 2020.
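
As a rough sketch of that advice, the request below names the summary file explicitly and then attaches labels from the matching variable table, so the codes and labels come from the same dataset and year. `fetch_origin_2010()` is a hypothetical wrapper written for illustration, and its default geography is an arbitrary choice. The live call assumes a Census API key has already been configured.

```r
# Hypothetical wrapper (not part of tidycensus): request the two variables
# discussed in this thread from an explicitly named summary file.
fetch_origin_2010 <- function(geography = "state", sumfile = "sf1") {
  tidycensus::get_decennial(
    geography = geography,
    variables = c("P004002", "P004003"),
    year = 2010,
    sumfile = sumfile
  )
}

# Live request: needs network access and a Census API key,
# so it only runs in an interactive session.
if (interactive()) {
  origin <- fetch_origin_2010()
  # Labels come from the same dataset and year as the data request.
  labels <- tidycensus::load_variables(2010, "sf1")
  print(merge(origin, labels[, c("name", "label")],
              by.x = "variable", by.y = "name"))
}
```

Joining the labels back onto the result makes it easy to check that each code means what you expect before using it.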

Great, thanks for your help.