ropensci/stats19

get_stats19 for 1979 to 2004 fails

layik opened this issue · 5 comments

layik commented

This fails both with and without a data_dir specified:

get_stats19(year = 1979)

#> ==================================================
#> downloaded 241.9 MB

#> Data saved at /tmp/RtmpRZhhGH/Stats19-Data1979-2004/Vehicles7904.csv/tmp/RtmpRZhhGH/Stats19-Data1979-2004/Road-Accident-Safety-Data-Guide-1979-2004.xls/tmp/RtmpRZhhGH/Stats19-Data1979-2004/Casualty7904.csv/tmp/RtmpRZhhGH/Stats19-Data1979-2004/Accidents7904.csv
#> No files of that type found for that year.
#> This will download 240 MB+ (1.8 GB unzipped).
#> Coordinates and other variables may be unreliable in these datasets.
#> See https://github.com/ropensci/stats19/issues/101 and https://github.com/ropensci/stats19/issues/102
#> Stats19-Data1979-2004
#> Error: Change data_dir, filename, year or run dl_stats19() first.

This is not covered by tests because of the size of the files. I suggest that:

  1. All future changes take this untested case into account, and
  2. Everyone contributing is prepared to run tests that include 1979 to 2004.
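
One way to keep the 1979 to 2004 case testable without downloading 240 MB+ on every run could be an opt-in test. The following is only a minimal sketch, assuming the testthat package; the environment variable STATS19_TEST_LARGE and the helper run_large_tests() are hypothetical names and not part of stats19:

```r
library(testthat)

# Hypothetical helper: TRUE only when the contributor has opted in to
# the large download by setting STATS19_TEST_LARGE=true (assumed name).
run_large_tests = function() {
  identical(Sys.getenv("STATS19_TEST_LARGE"), "true")
}

test_that("get_stats19() works for 1979 to 2004", {
  # Never run the large download on CRAN or by default
  skip_on_cran()
  skip_if_not(
    run_large_tests(),
    "Set STATS19_TEST_LARGE=true to run the 240 MB+ download test"
  )
  crashes = stats19::get_stats19(year = 1979)
  expect_s3_class(crashes, "data.frame")
  expect_gt(nrow(crashes), 0)
})
```

Contributors touching the download or read code could then set the variable locally before running the test suite, while default runs skip the test.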

Thanks to anyone who can give a hand.

layik commented

Watch out for #62, #168 and others.

layik commented

Sincere apologies, this was caused by my local version. The CRAN version is fine.

I cannot reproduce the issue on my main desktop:

stats19::get_stats19(year = 1979)
#> No files of that type found for that year.
#> This will download 240 MB+ (1.8 GB unzipped).
#> Coordinates and other variables may be unreliable in these datasets.
#> See https://github.com/ropensci/stats19/issues/101 and https://github.com/ropensci/stats19/issues/102
#> Files identified: Stats19-Data1979-2004.zip
#>    http://data.dft.gov.uk.s3.amazonaws.com/road-accidents-safety-data/Stats19-Data1979-2004.zip
#> Attempt downloading from:
#> Data saved at ~/stats19-data/Stats19-Data1979-2004/Vehicles7904.csv~/stats19-data/Stats19-Data1979-2004/Road-Accident-Safety-Data-Guide-1979-2004.xls~/stats19-data/Stats19-Data1979-2004/Casualty7904.csv~/stats19-data/Stats19-Data1979-2004/Accidents7904.csv
#> No files of that type found for that year.
#> This will download 240 MB+ (1.8 GB unzipped).
#> Coordinates and other variables may be unreliable in these datasets.
#> See https://github.com/ropensci/stats19/issues/101 and https://github.com/ropensci/stats19/issues/102
#> Reading in:
#> /home/robin/stats19-data/Stats19-Data1979-2004/Accidents7904.csv
#> date and time columns present, creating formatted datetime column
#> # A tibble: 6,224,198 x 33
#>    accident_index location_eastin… location_northi… longitude latitude
#>    <chr>                     <int>            <int>     <dbl>    <dbl>
#>  1 197901A11AD14                NA               NA        NA       NA
#>  2 197901A1BAW34            198460           894000        NA       NA
#>  3 197901A1BFD77            406380           307000        NA       NA
#>  4 197901A1BGC20            281680           440000        NA       NA
#>  5 197901A1BGF95            153960           795000        NA       NA
#>  6 197901A1CBC96            300370           146000        NA       NA
#>  7 197901A1DAK71            143370           951000        NA       NA
#>  8 197901A1DAP95            471960           845000        NA       NA
#>  9 197901A1EAC32            323880           632000        NA       NA
#> 10 197901A1FBK75            136380           245000        NA       NA
#> # … with 6,224,188 more rows, and 28 more variables: police_force <chr>,
#> #   accident_severity <chr>, number_of_vehicles <int>,
#> #   number_of_casualties <int>, date <date>, day_of_week <chr>, time <chr>,
#> #   local_authority_district <chr>, local_authority_highway <chr>,
#> #   first_road_class <chr>, first_road_number <int>, road_type <chr>,
#> #   speed_limit <int>, junction_detail <chr>, junction_control <chr>,
#> #   second_road_class <chr>, second_road_number <int>,
#> #   pedestrian_crossing_human_control <chr>,
#> #   pedestrian_crossing_physical_facilities <chr>, light_conditions <chr>,
#> #   weather_conditions <chr>, road_surface_conditions <chr>,
#> #   special_conditions_at_site <chr>, carriageway_hazards <chr>,
#> #   urban_or_rural_area <chr>,
#> #   did_police_officer_attend_scene_of_accident <int>,
#> #   lsoa_of_accident_location <chr>, datetime <dttm>

Created on 2020-07-01 by the reprex package (v0.3.0)

Session info
devtools::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value                       
#>  version  R version 3.6.3 (2020-02-29)
#>  os       Ubuntu 18.04.4 LTS          
#>  system   x86_64, linux-gnu           
#>  ui       X11                         
#>  language en_GB:en                    
#>  collate  en_GB.UTF-8                 
#>  ctype    en_GB.UTF-8                 
#>  tz       Europe/London               
#>  date     2020-07-01                  
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package     * version    date       lib source                            
#>  assertthat    0.2.1      2019-03-21 [2] CRAN (R 3.6.0)                    
#>  backports     1.1.8      2020-06-17 [1] CRAN (R 3.6.3)                    
#>  callr         3.4.3      2020-03-28 [1] CRAN (R 3.6.3)                    
#>  cli           2.0.2      2020-02-28 [1] CRAN (R 3.6.2)                    
#>  crayon        1.3.4      2017-09-16 [2] standard (@1.3.4)                 
#>  desc          1.2.0      2018-05-01 [2] standard (@1.2.0)                 
#>  devtools      2.3.0      2020-04-10 [1] CRAN (R 3.6.3)                    
#>  digest        0.6.25     2020-02-23 [1] CRAN (R 3.6.2)                    
#>  ellipsis      0.3.1      2020-05-15 [3] CRAN (R 3.6.3)                    
#>  evaluate      0.14       2019-05-28 [2] CRAN (R 3.6.0)                    
#>  fansi         0.4.1      2020-01-08 [1] CRAN (R 3.6.2)                    
#>  fs            1.4.2      2020-06-30 [2] CRAN (R 3.6.3)                    
#>  glue          1.4.1      2020-05-13 [2] CRAN (R 3.6.3)                    
#>  highr         0.8        2019-03-20 [3] CRAN (R 3.5.3)                    
#>  hms           0.5.3      2020-01-08 [1] CRAN (R 3.6.2)                    
#>  htmltools     0.5.0.9000 2020-06-18 [1] Github (rstudio/htmltools@a8025f3)
#>  knitr         1.29       2020-06-23 [1] CRAN (R 3.6.3)                    
#>  lifecycle     0.2.0.9000 2020-06-30 [1] Github (r-lib/lifecycle@8e0f87b)  
#>  magrittr      1.5        2014-11-22 [2] CRAN (R 3.5.2)                    
#>  memoise       1.1.0      2017-04-21 [3] CRAN (R 3.5.0)                    
#>  pillar        1.4.4      2020-05-05 [1] CRAN (R 3.6.3)                    
#>  pkgbuild      1.0.8      2020-05-07 [1] CRAN (R 3.6.3)                    
#>  pkgconfig     2.0.3      2019-09-22 [2] CRAN (R 3.6.1)                    
#>  pkgload       1.1.0      2020-05-29 [3] CRAN (R 3.6.3)                    
#>  prettyunits   1.1.1      2020-01-24 [1] CRAN (R 3.6.2)                    
#>  processx      3.4.2      2020-02-09 [1] CRAN (R 3.6.3)                    
#>  ps            1.3.3      2020-05-08 [1] CRAN (R 3.6.3)                    
#>  R6            2.4.1      2019-11-12 [2] CRAN (R 3.6.1)                    
#>  Rcpp          1.0.4.6    2020-04-09 [1] CRAN (R 3.6.3)                    
#>  readr         1.3.1      2018-12-21 [2] CRAN (R 3.5.3)                    
#>  remotes       2.1.1      2020-02-15 [1] CRAN (R 3.6.2)                    
#>  rlang         0.4.6.9000 2020-06-30 [1] Github (r-lib/rlang@64f57b0)      
#>  rmarkdown     2.3        2020-06-18 [1] CRAN (R 3.6.3)                    
#>  rprojroot     1.3-2      2018-01-03 [2] CRAN (R 3.5.3)                    
#>  sessioninfo   1.1.1      2018-11-05 [3] CRAN (R 3.5.1)                    
#>  stats19       1.2.0      2020-06-01 [1] Github (ropensci/stats19@3faf49d) 
#>  stringi       1.4.6      2020-02-17 [1] CRAN (R 3.6.2)                    
#>  stringr       1.4.0      2019-02-10 [2] standard (@1.4.0)                 
#>  testthat      2.3.2      2020-03-02 [1] CRAN (R 3.6.3)                    
#>  tibble        3.0.1      2020-04-20 [1] CRAN (R 3.6.3)                    
#>  usethis       1.6.1      2020-04-29 [1] CRAN (R 3.6.3)                    
#>  utf8          1.1.4      2018-05-24 [2] CRAN (R 3.5.3)                    
#>  vctrs         0.3.1      2020-06-05 [1] CRAN (R 3.6.3)                    
#>  withr         2.2.0      2020-04-20 [2] CRAN (R 3.6.3)                    
#>  xfun          0.15       2020-06-21 [1] CRAN (R 3.6.3)                    
#>  yaml          2.2.1      2020-02-01 [1] CRAN (R 3.6.2)                    
#> 
#> [1] /home/robin/R/x86_64-pc-linux-gnu-library/3.6
#> [2] /usr/local/lib/R/site-library
#> [3] /usr/lib/R/site-library
#> [4] /usr/lib/R/library

Great, glad it's working. For the record, it also works on a fresh Docker image:

stats19::get_stats19(year = 1979)
#> No files of that type found for that year.
#> This will download 240 MB+ (1.8 GB unzipped).
#> Coordinates and other variables may be unreliable in these datasets.
#> See https://github.com/ropensci/stats19/issues/101 and https://github.com/ropensci/stats19/issues/102
#> Files identified: Stats19-Data1979-2004.zip
#>    http://data.dft.gov.uk.s3.amazonaws.com/road-accidents-safety-data/Stats19-Data1979-2004.zip
#> Attempt downloading from:
#> Data saved at /tmp/RtmpKvkxpf/Stats19-Data1979-2004/Vehicles7904.csv/tmp/RtmpKvkxpf/Stats19-Data1979-2004/Road-Accident-Safety-Data-Guide-1979-2004.xls/tmp/RtmpKvkxpf/Stats19-Data1979-2004/Casualty7904.csv/tmp/RtmpKvkxpf/Stats19-Data1979-2004/Accidents7904.csv
#> No files of that type found for that year.
#> This will download 240 MB+ (1.8 GB unzipped).
#> Coordinates and other variables may be unreliable in these datasets.
#> See https://github.com/ropensci/stats19/issues/101 and https://github.com/ropensci/stats19/issues/102
#> Reading in:
#> /tmp/RtmpKvkxpf/Stats19-Data1979-2004/Accidents7904.csv
#> date and time columns present, creating formatted datetime column
#> # A tibble: 6,224,198 x 33
#>    accident_index location_eastin… location_northi… longitude latitude
#>    <chr>                     <int>            <int>     <dbl>    <dbl>
#>  1 197901A11AD14                NA               NA        NA       NA
#>  2 197901A1BAW34            198460           894000        NA       NA
#>  3 197901A1BFD77            406380           307000        NA       NA
#>  4 197901A1BGC20            281680           440000        NA       NA
#>  5 197901A1BGF95            153960           795000        NA       NA
#>  6 197901A1CBC96            300370           146000        NA       NA
#>  7 197901A1DAK71            143370           951000        NA       NA
#>  8 197901A1DAP95            471960           845000        NA       NA
#>  9 197901A1EAC32            323880           632000        NA       NA
#> 10 197901A1FBK75            136380           245000        NA       NA
#> # … with 6,224,188 more rows, and 28 more variables: police_force <chr>,
#> #   accident_severity <chr>, number_of_vehicles <int>,
#> #   number_of_casualties <int>, date <date>, day_of_week <chr>, time <chr>,
#> #   local_authority_district <chr>, local_authority_highway <chr>,
#> #   first_road_class <chr>, first_road_number <int>, road_type <chr>,
#> #   speed_limit <int>, junction_detail <chr>, junction_control <chr>,
#> #   second_road_class <chr>, second_road_number <int>,
#> #   pedestrian_crossing_human_control <chr>,
#> #   pedestrian_crossing_physical_facilities <chr>, light_conditions <chr>,
#> #   weather_conditions <chr>, road_surface_conditions <chr>,
#> #   special_conditions_at_site <chr>, carriageway_hazards <chr>,
#> #   urban_or_rural_area <chr>,
#> #   did_police_officer_attend_scene_of_accident <int>,
#> #   lsoa_of_accident_location <chr>, datetime <dttm>

Created on 2020-07-01 by the reprex package (v0.3.0)

Session info
devtools::session_info()
#> Warning in system("timedatectl", intern = TRUE): running command 'timedatectl'
#> had status 1
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value                       
#>  version  R version 4.0.0 (2020-04-24)
#>  os       Ubuntu 20.04 LTS            
#>  system   x86_64, linux-gnu           
#>  ui       X11                         
#>  language (EN)                        
#>  collate  en_US.UTF-8                 
#>  ctype    en_US.UTF-8                 
#>  tz       Etc/UTC                     
#>  date     2020-07-01                  
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package     * version date       lib source        
#>  assertthat    0.2.1   2019-03-21 [1] CRAN (R 4.0.0)
#>  backports     1.1.8   2020-06-17 [1] CRAN (R 4.0.0)
#>  callr         3.4.3   2020-03-28 [1] CRAN (R 4.0.0)
#>  cli           2.0.2   2020-02-28 [1] CRAN (R 4.0.0)
#>  crayon        1.3.4   2017-09-16 [1] CRAN (R 4.0.0)
#>  desc          1.2.0   2018-05-01 [1] CRAN (R 4.0.0)
#>  devtools      2.3.0   2020-04-10 [1] CRAN (R 4.0.0)
#>  digest        0.6.25  2020-02-23 [1] CRAN (R 4.0.0)
#>  ellipsis      0.3.1   2020-05-15 [1] CRAN (R 4.0.0)
#>  evaluate      0.14    2019-05-28 [1] CRAN (R 4.0.0)
#>  fansi         0.4.1   2020-01-08 [1] CRAN (R 4.0.0)
#>  fs            1.4.1   2020-04-04 [1] CRAN (R 4.0.0)
#>  glue          1.4.1   2020-05-13 [1] CRAN (R 4.0.0)
#>  highr         0.8     2019-03-20 [1] CRAN (R 4.0.0)
#>  hms           0.5.3   2020-01-08 [1] CRAN (R 4.0.0)
#>  htmltools     0.5.0   2020-06-16 [1] CRAN (R 4.0.0)
#>  knitr         1.29    2020-06-23 [1] CRAN (R 4.0.0)
#>  lifecycle     0.2.0   2020-03-06 [1] CRAN (R 4.0.0)
#>  magrittr      1.5     2014-11-22 [1] CRAN (R 4.0.0)
#>  memoise       1.1.0   2017-04-21 [1] CRAN (R 4.0.0)
#>  pillar        1.4.4   2020-05-05 [1] CRAN (R 4.0.0)
#>  pkgbuild      1.0.8   2020-05-07 [1] CRAN (R 4.0.0)
#>  pkgconfig     2.0.3   2019-09-22 [1] CRAN (R 4.0.0)
#>  pkgload       1.1.0   2020-05-29 [1] CRAN (R 4.0.0)
#>  prettyunits   1.1.1   2020-01-24 [1] CRAN (R 4.0.0)
#>  processx      3.4.2   2020-02-09 [1] CRAN (R 4.0.0)
#>  ps            1.3.3   2020-05-08 [1] CRAN (R 4.0.0)
#>  R6            2.4.1   2019-11-12 [1] CRAN (R 4.0.0)
#>  Rcpp          1.0.4.6 2020-04-09 [1] CRAN (R 4.0.0)
#>  readr         1.3.1   2018-12-21 [1] CRAN (R 4.0.0)
#>  remotes       2.1.1   2020-02-15 [1] CRAN (R 4.0.0)
#>  rlang         0.4.6   2020-05-02 [1] CRAN (R 4.0.0)
#>  rmarkdown     2.3     2020-06-18 [1] CRAN (R 4.0.0)
#>  rprojroot     1.3-2   2018-01-03 [1] CRAN (R 4.0.0)
#>  sessioninfo   1.1.1   2018-11-05 [1] CRAN (R 4.0.0)
#>  stats19       1.2.0   2020-03-03 [1] CRAN (R 4.0.0)
#>  stringi       1.4.6   2020-02-17 [1] CRAN (R 4.0.0)
#>  stringr       1.4.0   2019-02-10 [1] CRAN (R 4.0.0)
#>  testthat      2.3.2   2020-03-02 [1] CRAN (R 4.0.0)
#>  tibble        3.0.1   2020-04-20 [1] CRAN (R 4.0.0)
#>  usethis       1.6.1   2020-04-29 [1] CRAN (R 4.0.0)
#>  utf8          1.1.4   2018-05-24 [1] CRAN (R 4.0.0)
#>  vctrs         0.3.1   2020-06-05 [1] CRAN (R 4.0.0)
#>  withr         2.2.0   2020-04-20 [1] CRAN (R 4.0.0)
#>  xfun          0.15    2020-06-21 [1] CRAN (R 4.0.0)
#>  yaml          2.2.1   2020-02-01 [1] CRAN (R 4.0.0)
#> 
#> [1] /usr/local/lib/R/site-library
#> [2] /usr/local/lib/R/library

layik commented

Thanks!