mountainMath/cansim

Table 36-10-0434-03 retrieval error

MaiaPelletier opened this issue · 5 comments

I've run into an issue where passing table number "36-10-0434-03" to get_cansim() returns a different table from the same statistical program. See the parsed table info:

> cansim::get_cansim_table_info("36-10-0434-03")$`Cube Title`
[1] "Gross domestic product (GDP) at basic prices, by industry, monthly"

The expected cube title for this CODR table number is "Gross domestic product (GDP) at basic prices, by industry, annual average".

Thanks for flagging, this is a bug. Will prioritize this for fixing.

Ok, looks like this problem goes deeper than this package. The sub-tables (those ending in anything other than "-01") aren't available as full-table downloads, and they don't have "vectors" corresponding to them either. Trying to "download entire table", which is what the {cansim} package relies on, downloads the "base" table (ending in "-01") instead.
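To make the base table / sub-table distinction concrete, here is a minimal base R sketch that splits a table number into its base part and its suffix. The helper name `split_table_number` is hypothetical and not part of {cansim}. It assumes only the NN-NN-NNNN-NN format seen in this issue, and that a missing suffix means the base table.

```r
# Hypothetical helper, not part of {cansim}: split a CODR table number of the
# form "NN-NN-NNNN" or "NN-NN-NNNN-NN" into its base table and suffix.
split_table_number <- function(table_number) {
  pattern <- "^(\\d{2}-\\d{2}-\\d{4})(-(\\d{2}))?$"
  if (!grepl(pattern, table_number, perl = TRUE)) {
    stop("Unexpected table number format: ", table_number)
  }
  base <- sub(pattern, "\\1", table_number, perl = TRUE)
  suffix <- sub(pattern, "\\3", table_number, perl = TRUE)
  if (suffix == "") suffix <- "01"  # assumption: no suffix means the base table
  list(base = base, suffix = suffix, is_sub_table = suffix != "01")
}

info <- split_table_number("36-10-0434-03")
if (info$is_sub_table) {
  message("Table ", info$base, "-", info$suffix,
          " is a sub-table; a full download returns base table ",
          info$base, "-01.")
}
```

A check along these lines could be run before calling get_cansim(), so that a request for a sub-table is noticed before the base table comes back in its place.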

I will check with StatCan to see if there are alternative ways to download the table data. For now I would suggest working with the base table and averaging within each year. For example, the following code reproduces what's in table 36-10-0434-03 on the webpage. The first part averages over each year; the second part slices and formats the data like on the webpage.

library(cansim)
library(dplyr)
library(tidyr)

# Part 1: download the monthly base table and average within each calendar year
d <- get_cansim("36-10-0434") |>
  mutate(Year = strftime(Date, "%Y")) |>
  group_by(GeoUID, GEO, Year, `Seasonal adjustment`, Prices,
           `North American Industry Classification System (NAICS)`) |>
  summarize(val_norm = mean(val_norm),
            VALUE = mean(VALUE),
            .groups = "drop")

# Part 2: slice and format the data like the table on the webpage
d |>
  filter(Prices == "Chained (2012) dollars",
         GEO == "Canada",
         `Seasonal adjustment` == "Seasonally adjusted at annual rates") |>
  select(Year, `North American Industry Classification System (NAICS)`, VALUE) |>
  filter(Year %in% seq(2017, 2021)) |>
  mutate(VALUE = as.integer(VALUE)) |>
  pivot_wider(names_from = Year, values_from = VALUE)
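One possible safeguard when averaging monthly data this way: a year with fewer than 12 months of data (typically the most recent one) still gets a mean, but that mean is not a true annual average. The base R sketch below uses made-up toy values, not StatCan data, to show how incomplete years could be counted and dropped. The 12-month threshold and all column names other than Date, Year and VALUE are assumptions for illustration.

```r
# Toy monthly series (made-up values): 2020 is complete, 2021 has only 3 months.
monthly <- data.frame(
  Date = seq(as.Date("2020-01-01"), as.Date("2021-03-01"), by = "month"),
  VALUE = c(rep(100, 12), rep(130, 3))
)
monthly$Year <- format(monthly$Date, "%Y")

# Annual mean and number of monthly observations per year.
annual <- aggregate(VALUE ~ Year, data = monthly, FUN = mean)
annual$n_months <- as.vector(table(monthly$Year)[annual$Year])

# Keep only years with all 12 months, so partial years are not reported
# as if they were annual averages.
complete <- annual[annual$n_months == 12, ]
print(complete)
```

With the real table the same idea would apply per group (geography, prices, seasonal adjustment, industry) rather than to a single series.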

Oh that's interesting! Thanks for investigating. The solution you proposed works just fine for me for now, thank you :)

For now I have added a message to the upcoming v0.3.12 release that warns users of the problem. I'm still waiting to hear back from StatCan and will keep this issue open until it is clear whether this can be done with {cansim} or needs work by StatCan.

Got a reply from StatCan:

The “download entire table” option seems to apply only to the base table (the monthly data table).

To download the annual data table, please use this customization: https://www150.statcan.gc.ca/t1/tbl1/en/cv!recreate.action?pid=3610043403&selectedNodeIds=2D1,3D1&checkedLevels=0D1,3D1,3D2,3D3,3D4,3D5,3D6&refPeriods=19970101,20210101&dimensionLayouts=&vectorDisplay=false
Then click the download options button, and then select the first option “download as displayed”.
Otherwise, data manipulation of the monthly data is the other way as you mentioned in your request.

I hope this will help.

Downloading data "as displayed" does not retrieve the whole sub-table, and it is also incompatible with programmatic, reproducible workflows. I am interpreting this answer to mean that StatCan does not currently support downloading entire sub-tables, so I am closing this issue.