Casualties - Acc_Index error for non-English locale
Closed this issue · 9 comments
Issue:
Accident indices are empty when running stats19::get_stats19() on a Windows machine with a Chinese locale.
@Robinlovelace is working on it!
This is a work-around. Run this at the beginning of the session (source):
Sys.setlocale("LC_ALL","English")
I will see if it’s ‘grep’ causing this
Is it right that Sys.setlocale("LC_ALL","English") is for Windows only? From
#> ?Sys.setlocale
# ....
Sys.setlocale("LC_TIME", "German") # Windows
# ....
it looks like this form of locale name is the Windows one.
I think this is slightly different, but because we have not tested on non-Windows machines, we have not been able to add a commit, @Robinlovelace. Correct?
Yes, I recall not wanting to call Sys.setlocale("LC_TIME", "English") during the session. It would be sufficient to do it in the function call (readr::read_csv(), I believe). Looking at this vignette https://cran.r-project.org/web/packages/readr/vignettes/locales.html I guess adding a locale = argument should be enough...
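A minimal base R sketch of keeping a locale change local to one call instead of the whole session, which is the concern raised above. The helper name with_ctype is hypothetical (it is not part of stats19 or readr), and the "C" locale is used only because that name is accepted on every platform, unlike "English", which is a Windows-style name.

```r
# Hypothetical helper: evaluate `expr` with LC_CTYPE temporarily set to
# `locale`, then restore the previous setting even if `expr` fails.
with_ctype = function(locale, expr) {
  old = Sys.getlocale("LC_CTYPE")
  on.exit(Sys.setlocale("LC_CTYPE", old), add = TRUE)
  Sys.setlocale("LC_CTYPE", locale)
  force(expr)  # `expr` is lazy, so it is only evaluated here, under the new locale
}

before = Sys.getlocale("LC_CTYPE")
inside = with_ctype("C", Sys.getlocale("LC_CTYPE"))
after = Sys.getlocale("LC_CTYPE")
```

Here `inside` is the locale seen by the wrapped expression, while `before` and `after` should match because the original setting is restored on exit. The withr package provides with_locale() for the same purpose.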
Update: see reprex below. I think we can just set it with locale = readr::locale("en"); testing this on a Windows machine should sort it. On a different note, any ideas what's causing all those warnings in the reprex below? I suspect it's an encoding issue this time...
stats19::dl_stats19(year = 2017, type = "ca")
#> Files identified: dftRoadSafetyData_Casualties_2017.zip
#> http://data.dft.gov.uk.s3.amazonaws.com/road-accidents-safety-data/dftRoadSafetyData_Casualties_2017.zip
#> Attempt downloading from:
#> Data saved at /tmp/RtmpnfcycV/dftRoadSafetyData_Casualties_2017/Cas.csv
f = file.path(tempdir(), "dftRoadSafetyData_Casualties_2017/Cas.csv")
c = readr::read_csv(f, locale = readr::locale("zh"))
#> Parsed with column specification:
#> cols(
#> Accident_Index = col_double(),
#> Vehicle_Reference = col_double(),
#> Casualty_Reference = col_double(),
#> Casualty_Class = col_double(),
#> Sex_of_Casualty = col_double(),
#> Age_of_Casualty = col_double(),
#> Age_Band_of_Casualty = col_double(),
#> Casualty_Severity = col_double(),
#> Pedestrian_Location = col_double(),
#> Pedestrian_Movement = col_double(),
#> Car_Passenger = col_double(),
#> Bus_or_Coach_Passenger = col_double(),
#> Pedestrian_Road_Maintenance_Worker = col_double(),
#> Casualty_Type = col_double(),
#> Casualty_Home_Area_Type = col_double(),
#> Casualty_IMD_Decile = col_double()
#> )
#> Warning: 28286 parsing failures.
#> row col expected actual file
#> 32166 Accident_Index no trailing characters T158779 '/tmp/RtmpnfcycV/dftRoadSafetyData_Casualties_2017/Cas.csv'
#> 32167 Accident_Index no trailing characters T163737 '/tmp/RtmpnfcycV/dftRoadSafetyData_Casualties_2017/Cas.csv'
#> 32168 Accident_Index no trailing characters T171585 '/tmp/RtmpnfcycV/dftRoadSafetyData_Casualties_2017/Cas.csv'
#> 32169 Accident_Index no trailing characters T171585 '/tmp/RtmpnfcycV/dftRoadSafetyData_Casualties_2017/Cas.csv'
#> 32170 Accident_Index no trailing characters T172074 '/tmp/RtmpnfcycV/dftRoadSafetyData_Casualties_2017/Cas.csv'
#> ..... .............. ...................... ....... ...........................................................
#> See problems(...) for more details.
summary(c$Accident_Index)
#> Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
#> 2.017e+12 2.017e+12 2.017e+12 Inf 2.017e+12 Inf 28286
c_en = readr::read_csv(f, locale = readr::locale("en"))
#> Parsed with column specification:
#> cols(
#> Accident_Index = col_double(),
#> Vehicle_Reference = col_double(),
#> Casualty_Reference = col_double(),
#> Casualty_Class = col_double(),
#> Sex_of_Casualty = col_double(),
#> Age_of_Casualty = col_double(),
#> Age_Band_of_Casualty = col_double(),
#> Casualty_Severity = col_double(),
#> Pedestrian_Location = col_double(),
#> Pedestrian_Movement = col_double(),
#> Car_Passenger = col_double(),
#> Bus_or_Coach_Passenger = col_double(),
#> Pedestrian_Road_Maintenance_Worker = col_double(),
#> Casualty_Type = col_double(),
#> Casualty_Home_Area_Type = col_double(),
#> Casualty_IMD_Decile = col_double()
#> )
#> Warning: 28286 parsing failures.
#> row col expected actual file
#> 32166 Accident_Index no trailing characters T158779 '/tmp/RtmpnfcycV/dftRoadSafetyData_Casualties_2017/Cas.csv'
#> 32167 Accident_Index no trailing characters T163737 '/tmp/RtmpnfcycV/dftRoadSafetyData_Casualties_2017/Cas.csv'
#> 32168 Accident_Index no trailing characters T171585 '/tmp/RtmpnfcycV/dftRoadSafetyData_Casualties_2017/Cas.csv'
#> 32169 Accident_Index no trailing characters T171585 '/tmp/RtmpnfcycV/dftRoadSafetyData_Casualties_2017/Cas.csv'
#> 32170 Accident_Index no trailing characters T172074 '/tmp/RtmpnfcycV/dftRoadSafetyData_Casualties_2017/Cas.csv'
#> ..... .............. ...................... ....... ...........................................................
#> See problems(...) for more details.
summary(c_en$Accident_Index)
#> Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
#> 2.017e+12 2.017e+12 2.017e+12 Inf 2.017e+12 Inf 28286
Created on 2019-07-28 by the reprex package (v0.3.0)
Session info
devtools::session_info()
#> ─ Session info ──────────────────────────────────────────────────────────
#> setting value
#> version R version 3.6.0 (2019-04-26)
#> os Debian GNU/Linux 9 (stretch)
#> system x86_64, linux-gnu
#> ui X11
#> language (EN)
#> collate en_US.UTF-8
#> ctype en_US.UTF-8
#> tz Etc/UTC
#> date 2019-07-28
#>
#> ─ Packages ──────────────────────────────────────────────────────────────
#> package * version date lib source
#> assertthat 0.2.1 2019-03-21 [1] CRAN (R 3.6.0)
#> backports 1.1.4 2019-04-10 [1] CRAN (R 3.6.0)
#> callr 3.3.0 2019-07-04 [1] CRAN (R 3.6.0)
#> cli 1.1.0 2019-03-19 [1] CRAN (R 3.6.0)
#> crayon 1.3.4 2017-09-16 [1] CRAN (R 3.6.0)
#> desc 1.2.0 2018-05-01 [1] CRAN (R 3.6.0)
#> devtools 2.1.0 2019-07-06 [1] CRAN (R 3.6.0)
#> digest 0.6.20 2019-07-04 [1] CRAN (R 3.6.0)
#> evaluate 0.14 2019-05-28 [1] CRAN (R 3.6.0)
#> fs 1.3.1 2019-05-06 [1] CRAN (R 3.6.0)
#> glue 1.3.1 2019-03-12 [1] CRAN (R 3.6.0)
#> highr 0.8 2019-03-20 [1] CRAN (R 3.6.0)
#> hms 0.4.2 2018-03-10 [1] CRAN (R 3.6.0)
#> htmltools 0.3.6 2017-04-28 [1] CRAN (R 3.6.0)
#> knitr 1.23 2019-05-18 [1] CRAN (R 3.6.0)
#> magrittr 1.5 2014-11-22 [1] CRAN (R 3.6.0)
#> memoise 1.1.0 2017-04-21 [1] CRAN (R 3.6.0)
#> pillar 1.4.2 2019-06-29 [1] CRAN (R 3.6.0)
#> pkgbuild 1.0.3 2019-03-20 [1] CRAN (R 3.6.0)
#> pkgconfig 2.0.2 2018-08-16 [1] CRAN (R 3.6.0)
#> pkgload 1.0.2 2018-10-29 [1] CRAN (R 3.6.0)
#> prettyunits 1.0.2 2015-07-13 [1] CRAN (R 3.6.0)
#> processx 3.4.0 2019-07-03 [1] CRAN (R 3.6.0)
#> ps 1.3.0 2018-12-21 [1] CRAN (R 3.6.0)
#> R6 2.4.0 2019-02-14 [1] CRAN (R 3.6.0)
#> Rcpp 1.0.1 2019-03-17 [1] CRAN (R 3.6.0)
#> readr 1.3.1 2018-12-21 [1] CRAN (R 3.6.0)
#> remotes 2.1.0 2019-06-24 [1] CRAN (R 3.6.0)
#> rlang 0.4.0 2019-06-25 [1] CRAN (R 3.6.0)
#> rmarkdown 1.13 2019-05-22 [1] CRAN (R 3.6.0)
#> rprojroot 1.3-2 2018-01-03 [1] CRAN (R 3.6.0)
#> sessioninfo 1.1.1 2018-11-05 [1] CRAN (R 3.6.0)
#> stats19 1.0.0 2019-07-27 [1] local
#> stringi 1.4.3 2019-03-12 [1] CRAN (R 3.6.0)
#> stringr 1.4.0 2019-02-10 [1] CRAN (R 3.6.0)
#> testthat 2.1.1 2019-04-23 [1] CRAN (R 3.6.0)
#> tibble 2.1.3 2019-06-06 [1] CRAN (R 3.6.0)
#> usethis 1.5.1 2019-07-04 [1] CRAN (R 3.6.0)
#> withr 2.1.2 2018-03-15 [1] CRAN (R 3.6.0)
#> xfun 0.8 2019-06-25 [1] CRAN (R 3.6.0)
#> yaml 2.2.0 2018-07-25 [1] CRAN (R 3.6.0)
#>
#> [1] /usr/local/lib/R/site-library
#> [2] /usr/local/lib/R/library
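On the parsing warnings in the reprex: they are identical under both locales, and the message ("no trailing characters", actual "T158779") suggests a column-type issue rather than an encoding one. readr 1.3.1 guesses column types from the first 1000 rows by default, so Accident_Index appears to have been guessed as col_double() and later values containing letters could not be parsed. The sketch below uses a small synthetic file as a stand-in for Cas.csv (the index values are made up for illustration) and reads the index as character via col_types.

```r
library(readr)

# Synthetic stand-in for Cas.csv: the index is digits-only for 1500 rows,
# then one row contains a letter (values are invented for illustration).
f = tempfile(fileext = ".csv")
idx = c(sprintf("2017010%06d", 1:1500), "201743T158779")
writeLines(c("Accident_Index,Casualty_Class", paste0(idx, ",1")), f)

# Type guessing from the first rows only: the late non-numeric value may be
# reported as a parsing failure, depending on the readr version.
guessed = suppressWarnings(read_csv(f, guess_max = 1000))

# Declaring the index as character avoids relying on the guess.
fixed = read_csv(
  f,
  col_types = cols(.default = col_double(), Accident_Index = col_character())
)
```

With an explicit col_character() the index is kept as text, so values with letters survive and long numeric indices are not converted to floating point.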
Let's call it a day until further trouble.
👍 thanks @layik and great 🏓 !