ropensci/stats19

Casualties - Acc_Index error for non-English locale

Closed this issue · 9 comments

layik commented

Issue:
Accident Indices are empty when running

stats19::get_stats19()

on a Windows machine with Chinese locale.

@Robinlovelace is working on it!

This is a work-around. Run this at the beginning of the session (source):

Sys.setlocale("LC_ALL","English")

layik commented

I will see if it’s ‘grep’ causing this

layik commented

Is that

Sys.setlocale("LC_ALL","English")

for Windows only? The examples in ?Sys.setlocale suggest so:

#> ?Sys.setlocale
# ....
Sys.setlocale("LC_TIME", "German") # Windows
# ....

The "German" locale name there is marked as Windows.
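For what it's worth, the locale names that Sys.setlocale() accepts are platform-dependent, so a cross-platform version of the workaround might look like the rough sketch below. The non-Windows name "en_US.UTF-8" is an assumption; it may not be installed on every system, in which case Sys.setlocale() returns an empty string with a warning. The sketch uses LC_TIME rather than LC_ALL to keep the change small.

```r
# Pick an English locale name by platform.
# "English" is the Windows name used above; "en_US.UTF-8" is an assumed
# name for Linux/macOS and may not be available everywhere.
english_name <- if (.Platform$OS.type == "windows") "English" else "en_US.UTF-8"

# Remember the current setting so it can be reported or restored later.
old <- Sys.getlocale("LC_TIME")

# Sys.setlocale() returns "" (with a warning) if the request cannot be honoured.
res <- suppressWarnings(Sys.setlocale("LC_TIME", english_name))
if (identical(res, "")) {
  message("Locale '", english_name, "' is not available; keeping '", old, "'")
}
```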

Wasn't this fixed in #83 @layik ?

layik commented

I think this is slightly different, but because we have not tested on non-Windows machines, we have not been able to add a commit, @Robinlovelace. Correct?

Yes, I recall not wanting to call Sys.setlocale("LC_TIME", "English") during the session. It would be sufficient to do it in the function call (readr::read_csv(), I believe). Looking at this vignette https://cran.r-project.org/web/packages/readr/vignettes/locales.html I guess adding a locale = argument should be enough...
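If a session-level locale change is ever needed anyway, one way to avoid leaving it in place is to scope it to a single call and restore the previous value afterwards. This is only a sketch: with_time_locale() is a hypothetical helper, not part of stats19 or readr, and it uses the "C" locale because that name should be accepted on every platform.

```r
# Hypothetical helper (not in stats19): evaluate `expr` under a temporary
# LC_TIME locale and restore the previous setting on exit.
with_time_locale <- function(locale, expr) {
  old <- Sys.getlocale("LC_TIME")
  on.exit(Sys.setlocale("LC_TIME", old), add = TRUE)
  Sys.setlocale("LC_TIME", locale)
  force(expr)  # `expr` is a promise, so it is evaluated after the switch
}

before <- Sys.getlocale("LC_TIME")
inside <- with_time_locale("C", Sys.getlocale("LC_TIME"))
after <- Sys.getlocale("LC_TIME")
```

Here `inside` is the locale seen by the wrapped expression, while `before` and `after` should match. withr::with_locale() offers a similar pattern if a dependency on withr is acceptable.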

Update: see reprex below. I think we can just set it with locale = readr::locale("en"); testing this on a Windows machine should confirm whether that sorts it. On a different note, any ideas what's causing all those warnings in the reprex below? I suspect it's an encoding issue this time...

stats19::dl_stats19(year = 2017, type = "ca")
#> Files identified: dftRoadSafetyData_Casualties_2017.zip
#>    http://data.dft.gov.uk.s3.amazonaws.com/road-accidents-safety-data/dftRoadSafetyData_Casualties_2017.zip
#> Attempt downloading from:
#> Data saved at /tmp/RtmpnfcycV/dftRoadSafetyData_Casualties_2017/Cas.csv
f = file.path(tempdir(), "dftRoadSafetyData_Casualties_2017/Cas.csv")
c = readr::read_csv(f, locale = readr::locale("zh"))
#> Parsed with column specification:
#> cols(
#>   Accident_Index = col_double(),
#>   Vehicle_Reference = col_double(),
#>   Casualty_Reference = col_double(),
#>   Casualty_Class = col_double(),
#>   Sex_of_Casualty = col_double(),
#>   Age_of_Casualty = col_double(),
#>   Age_Band_of_Casualty = col_double(),
#>   Casualty_Severity = col_double(),
#>   Pedestrian_Location = col_double(),
#>   Pedestrian_Movement = col_double(),
#>   Car_Passenger = col_double(),
#>   Bus_or_Coach_Passenger = col_double(),
#>   Pedestrian_Road_Maintenance_Worker = col_double(),
#>   Casualty_Type = col_double(),
#>   Casualty_Home_Area_Type = col_double(),
#>   Casualty_IMD_Decile = col_double()
#> )
#> Warning: 28286 parsing failures.
#>   row            col               expected  actual                                                        file
#> 32166 Accident_Index no trailing characters T158779 '/tmp/RtmpnfcycV/dftRoadSafetyData_Casualties_2017/Cas.csv'
#> 32167 Accident_Index no trailing characters T163737 '/tmp/RtmpnfcycV/dftRoadSafetyData_Casualties_2017/Cas.csv'
#> 32168 Accident_Index no trailing characters T171585 '/tmp/RtmpnfcycV/dftRoadSafetyData_Casualties_2017/Cas.csv'
#> 32169 Accident_Index no trailing characters T171585 '/tmp/RtmpnfcycV/dftRoadSafetyData_Casualties_2017/Cas.csv'
#> 32170 Accident_Index no trailing characters T172074 '/tmp/RtmpnfcycV/dftRoadSafetyData_Casualties_2017/Cas.csv'
#> ..... .............. ...................... ....... ...........................................................
#> See problems(...) for more details.
summary(c$Accident_Index)
#>      Min.   1st Qu.    Median      Mean   3rd Qu.      Max.      NA's 
#> 2.017e+12 2.017e+12 2.017e+12       Inf 2.017e+12       Inf     28286
c_en = readr::read_csv(f, locale = readr::locale("en"))
#> Parsed with column specification:
#> cols(
#>   Accident_Index = col_double(),
#>   Vehicle_Reference = col_double(),
#>   Casualty_Reference = col_double(),
#>   Casualty_Class = col_double(),
#>   Sex_of_Casualty = col_double(),
#>   Age_of_Casualty = col_double(),
#>   Age_Band_of_Casualty = col_double(),
#>   Casualty_Severity = col_double(),
#>   Pedestrian_Location = col_double(),
#>   Pedestrian_Movement = col_double(),
#>   Car_Passenger = col_double(),
#>   Bus_or_Coach_Passenger = col_double(),
#>   Pedestrian_Road_Maintenance_Worker = col_double(),
#>   Casualty_Type = col_double(),
#>   Casualty_Home_Area_Type = col_double(),
#>   Casualty_IMD_Decile = col_double()
#> )
#> Warning: 28286 parsing failures.
#>   row            col               expected  actual                                                        file
#> 32166 Accident_Index no trailing characters T158779 '/tmp/RtmpnfcycV/dftRoadSafetyData_Casualties_2017/Cas.csv'
#> 32167 Accident_Index no trailing characters T163737 '/tmp/RtmpnfcycV/dftRoadSafetyData_Casualties_2017/Cas.csv'
#> 32168 Accident_Index no trailing characters T171585 '/tmp/RtmpnfcycV/dftRoadSafetyData_Casualties_2017/Cas.csv'
#> 32169 Accident_Index no trailing characters T171585 '/tmp/RtmpnfcycV/dftRoadSafetyData_Casualties_2017/Cas.csv'
#> 32170 Accident_Index no trailing characters T172074 '/tmp/RtmpnfcycV/dftRoadSafetyData_Casualties_2017/Cas.csv'
#> ..... .............. ...................... ....... ...........................................................
#> See problems(...) for more details.
summary(c_en$Accident_Index)
#>      Min.   1st Qu.    Median      Mean   3rd Qu.      Max.      NA's 
#> 2.017e+12 2.017e+12 2.017e+12       Inf 2.017e+12       Inf     28286

Created on 2019-07-28 by the reprex package (v0.3.0)
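
On the warnings: one plausible cause (an assumption, not something confirmed in this thread) is column-type guessing rather than locale or encoding. readr 1.3.1 guesses column types from the first guess_max rows (1000 by default), so if Accident_Index starts with digit-only values it can be guessed as col_double(), and later values containing letters then fail with "no trailing characters". That would also explain why both locales give the same 28286 failures. A minimal sketch with synthetic data standing in for Cas.csv (the index values below are made up), forcing the column to character:

```r
library(readr)

# Synthetic stand-in for Cas.csv: 1500 digit-only indices followed by a few
# alphanumeric ones. These values are invented for illustration.
f <- tempfile(fileext = ".csv")
ids <- c(sprintf("2017%09d", 1:1500), sprintf("2017AB%07d", 1:5))
writeLines(c("Accident_Index,Casualty_Class", paste(ids, 1, sep = ",")), f)

# Declare the index as character up front so no guessing is involved;
# every other column falls back to double.
cas <- read_csv(
  f,
  col_types = cols(Accident_Index = col_character(), .default = col_double())
)
```

With the type declared explicitly the alphanumeric indices should be kept as text rather than turned into NA, independent of which locale is passed.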

Session info
devtools::session_info()
#> ─ Session info ──────────────────────────────────────────────────────────
#>  setting  value                       
#>  version  R version 3.6.0 (2019-04-26)
#>  os       Debian GNU/Linux 9 (stretch)
#>  system   x86_64, linux-gnu           
#>  ui       X11                         
#>  language (EN)                        
#>  collate  en_US.UTF-8                 
#>  ctype    en_US.UTF-8                 
#>  tz       Etc/UTC                     
#>  date     2019-07-28                  
#> 
#> ─ Packages ──────────────────────────────────────────────────────────────
#>  package     * version date       lib source        
#>  assertthat    0.2.1   2019-03-21 [1] CRAN (R 3.6.0)
#>  backports     1.1.4   2019-04-10 [1] CRAN (R 3.6.0)
#>  callr         3.3.0   2019-07-04 [1] CRAN (R 3.6.0)
#>  cli           1.1.0   2019-03-19 [1] CRAN (R 3.6.0)
#>  crayon        1.3.4   2017-09-16 [1] CRAN (R 3.6.0)
#>  desc          1.2.0   2018-05-01 [1] CRAN (R 3.6.0)
#>  devtools      2.1.0   2019-07-06 [1] CRAN (R 3.6.0)
#>  digest        0.6.20  2019-07-04 [1] CRAN (R 3.6.0)
#>  evaluate      0.14    2019-05-28 [1] CRAN (R 3.6.0)
#>  fs            1.3.1   2019-05-06 [1] CRAN (R 3.6.0)
#>  glue          1.3.1   2019-03-12 [1] CRAN (R 3.6.0)
#>  highr         0.8     2019-03-20 [1] CRAN (R 3.6.0)
#>  hms           0.4.2   2018-03-10 [1] CRAN (R 3.6.0)
#>  htmltools     0.3.6   2017-04-28 [1] CRAN (R 3.6.0)
#>  knitr         1.23    2019-05-18 [1] CRAN (R 3.6.0)
#>  magrittr      1.5     2014-11-22 [1] CRAN (R 3.6.0)
#>  memoise       1.1.0   2017-04-21 [1] CRAN (R 3.6.0)
#>  pillar        1.4.2   2019-06-29 [1] CRAN (R 3.6.0)
#>  pkgbuild      1.0.3   2019-03-20 [1] CRAN (R 3.6.0)
#>  pkgconfig     2.0.2   2018-08-16 [1] CRAN (R 3.6.0)
#>  pkgload       1.0.2   2018-10-29 [1] CRAN (R 3.6.0)
#>  prettyunits   1.0.2   2015-07-13 [1] CRAN (R 3.6.0)
#>  processx      3.4.0   2019-07-03 [1] CRAN (R 3.6.0)
#>  ps            1.3.0   2018-12-21 [1] CRAN (R 3.6.0)
#>  R6            2.4.0   2019-02-14 [1] CRAN (R 3.6.0)
#>  Rcpp          1.0.1   2019-03-17 [1] CRAN (R 3.6.0)
#>  readr         1.3.1   2018-12-21 [1] CRAN (R 3.6.0)
#>  remotes       2.1.0   2019-06-24 [1] CRAN (R 3.6.0)
#>  rlang         0.4.0   2019-06-25 [1] CRAN (R 3.6.0)
#>  rmarkdown     1.13    2019-05-22 [1] CRAN (R 3.6.0)
#>  rprojroot     1.3-2   2018-01-03 [1] CRAN (R 3.6.0)
#>  sessioninfo   1.1.1   2018-11-05 [1] CRAN (R 3.6.0)
#>  stats19       1.0.0   2019-07-27 [1] local         
#>  stringi       1.4.3   2019-03-12 [1] CRAN (R 3.6.0)
#>  stringr       1.4.0   2019-02-10 [1] CRAN (R 3.6.0)
#>  testthat      2.1.1   2019-04-23 [1] CRAN (R 3.6.0)
#>  tibble        2.1.3   2019-06-06 [1] CRAN (R 3.6.0)
#>  usethis       1.5.1   2019-07-04 [1] CRAN (R 3.6.0)
#>  withr         2.1.2   2018-03-15 [1] CRAN (R 3.6.0)
#>  xfun          0.8     2019-06-25 [1] CRAN (R 3.6.0)
#>  yaml          2.2.0   2018-07-25 [1] CRAN (R 3.6.0)
#> 
#> [1] /usr/local/lib/R/site-library
#> [2] /usr/local/lib/R/library

layik commented

Let's call it a day until further trouble.

👍 thanks @layik and great 🏓 !