Mode & Vehicle Type values missing when running historical.process()
francescolovat opened this issue · 4 comments
Hi iTEM colleagues,
I'm raising this issue since recent usages of process.historical(0) (T000) and process.historical(4) (T004) led to NaNs in the resulting dataframe columns Mode & Vehicle Type (for the former) and Vehicle Type (for the latter).
Values seem to be present in the .csv files, so this might come from the processing scripts, unless I'm missing something.
It seems like a code issue, as you suggested. To verify that, I am attaching the final spreadsheet that Humberto generated from his local code, which does not have the issue that you raised.
The spreadsheet attached here does not contain "nan"; those values are shown as ALL. I am not sure how to fix the online code, though.
@francescolovat good catch.
The best way to address this will be a consistency check that gets run on the cleaning script for each of the upstream data sources.
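As a rough illustration of what such a check could look like, here is a minimal sketch using pandas. The function name `check_no_missing`, the column labels, and the sample data are hypothetical and not taken from the iTEM code base; the actual checks may well differ.

```python
import pandas as pd


def check_no_missing(df: pd.DataFrame, columns: list[str]) -> dict[str, int]:
    """Return a mapping of column label -> count of missing values.

    Only columns with at least one missing value are reported. A column that
    is absent from `df` entirely is reported with every row counted as missing.
    """
    problems = {}
    for col in columns:
        if col not in df.columns:
            problems[col] = len(df)
            continue
        n_missing = int(df[col].isna().sum())
        if n_missing:
            problems[col] = n_missing
    return problems


# Hypothetical cleaned data with gaps like those described in this issue
cleaned = pd.DataFrame(
    {
        "Mode": ["Rail", None, "Road"],
        "Vehicle Type": [None, None, "All"],
        "Value": [1.0, 2.0, 3.0],
    }
)

problems = check_no_missing(cleaned, ["Mode", "Vehicle Type", "Value"])
print(problems)
```

Run at the end of each cleaning script, a non-empty result could be turned into an assertion failure, so that gaps like these are caught before the processed data reaches users.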
Thanks @francescolovat for opening #72. This will be closed by #71, which adds the consistency checks I mentioned:
> The best way to address this will be a consistency check that gets run on the cleaning script for each of the upstream data sources.
…along with other improvements, namely avoiding the use of easily-confused column names (either via the older ColumnName enumeration or the newer column_name() function) and simply using everywhere the consistent IDs of the SDMX dimensions introduced by #62.
This has the advantage of associating the column labels with the SDMX concepts that have the same IDs. Those concepts can have long, verbose names and descriptions, longer than would be practical to stick in a column label.
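To illustrate the idea, the sketch below uses only the Python standard library. The `Concept` class and the IDs, names, and descriptions in it are invented for the example; they are not the actual iTEM or SDMX definitions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Concept:
    """A stand-in for an SDMX concept: a short ID plus verbose metadata."""

    id: str
    name: str
    description: str


# Hypothetical concepts; the real IDs are those of the SDMX dimensions
CONCEPTS = {
    c.id: c
    for c in (
        Concept(
            "MODE",
            "Mode of transport",
            "Broad transport mode, e.g. road, rail, air or shipping.",
        ),
        Concept(
            "VEHICLE",
            "Vehicle type",
            "Kind of vehicle within a mode, or an aggregate over all types.",
        ),
    )
}

# Column labels are just the short IDs...
columns = ["MODE", "VEHICLE"]

# ...and the verbose text is looked up by ID only when it is needed
for label in columns:
    concept = CONCEPTS[label]
    print(f"{label}: {concept.name} ({concept.description})")
```

The column labels stay short and unambiguous, while the longer names and descriptions live in one place and can be looked up by ID.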
Thank you, @khaeru!