ropensci/stats19

Error with multiple files download and sf/ppp format

Closed this issue · 3 comments

Today I tried the following:

remotes::install_github("ropensci/stats19")
#> Skipping install of 'stats19' from a github remote, the SHA1 (6163bc29) has not changed since last install.
#>   Use `force = TRUE` to force installation

library(stats19)
#> Data provided under OGL v3.0. Cite the source and link to:
#> www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
res <- get_stats19(c(2017, 2018), silent = TRUE)
#> Files identified: dftRoadSafetyData_Accidents_2017.zip
#>    http://data.dft.gov.uk.s3.amazonaws.com/road-accidents-safety-data/dftRoadSafetyData_Accidents_2017.zip
#> Attempt downloading from:
#> Data saved at C:\Users\Utente\AppData\Local\Temp\RtmpE1zPip/dftRoadSafetyData_Accidents_2017/Acc.csv
#> Reading in:
#> C:\Users\Utente\AppData\Local\Temp\RtmpE1zPip/dftRoadSafetyData_Accidents_2017/Acc.csv
#> date and time columns present, creating formatted datetime column
#> Files identified: dftRoadSafetyData_Accidents_2018.csv
#>    http://data.dft.gov.uk.s3.amazonaws.com/road-accidents-safety-data/dftRoadSafetyData_Accidents_2018.csv
#> Attempt downloading from:
#> Data saved at C:\Users\Utente\AppData\Local\Temp\RtmpE1zPip/dftRoadSafetyData_Accidents_2018.csv
#> Reading in:
#> C:\Users\Utente\AppData\Local\Temp\RtmpE1zPip/dftRoadSafetyData_Accidents_2018.csv
#> date and time columns present, creating formatted datetime column
res <- get_stats19(c(2017, 2018), silent = TRUE, output_format = "sf")
#> Files identified: dftRoadSafetyData_Accidents_2017.zip
#>    http://data.dft.gov.uk.s3.amazonaws.com/road-accidents-safety-data/dftRoadSafetyData_Accidents_2017.zip
#> Data already exists in data_dir, not downloading
#> Data saved at C:\Users\Utente\AppData\Local\Temp\RtmpE1zPip/dftRoadSafetyData_Accidents_2017/Acc.csv
#> Reading in:
#> C:\Users\Utente\AppData\Local\Temp\RtmpE1zPip/dftRoadSafetyData_Accidents_2017/Acc.csv
#> date and time columns present, creating formatted datetime column
#> 19 rows removed with no coordinates
#> Files identified: dftRoadSafetyData_Accidents_2018.csv
#>    http://data.dft.gov.uk.s3.amazonaws.com/road-accidents-safety-data/dftRoadSafetyData_Accidents_2018.csv
#> Data already exists in data_dir, not downloading
#> Data saved at C:\Users\Utente\AppData\Local\Temp\RtmpE1zPip/dftRoadSafetyData_Accidents_2018.csv
#> Reading in:
#> C:\Users\Utente\AppData\Local\Temp\RtmpE1zPip/dftRoadSafetyData_Accidents_2018.csv
#> date and time columns present, creating formatted datetime column
#> 55 rows removed with no coordinates
#> Error: arguments have different crs
res <- get_stats19(c(2017, 2018), silent = TRUE, output_format = "ppp")
#> Files identified: dftRoadSafetyData_Accidents_2017.zip
#>    http://data.dft.gov.uk.s3.amazonaws.com/road-accidents-safety-data/dftRoadSafetyData_Accidents_2017.zip
#> Data already exists in data_dir, not downloading
#> Data saved at C:\Users\Utente\AppData\Local\Temp\RtmpE1zPip/dftRoadSafetyData_Accidents_2017/Acc.csv
#> Reading in:
#> C:\Users\Utente\AppData\Local\Temp\RtmpE1zPip/dftRoadSafetyData_Accidents_2017/Acc.csv
#> date and time columns present, creating formatted datetime column
#> 19 rows removed with no coordinates
#> Warning: some mark values are NA in the point pattern x
#> Error in rbind(deparse.level, ...): invalid list argument: all variables should have the same length

Created on 2020-01-30 by the reprex package (v0.3.0)

The problem is somewhere here:

stats19/R/get.R

Lines 89 to 101 in 6163bc2

if(is.vector(year) && length(year) > 1) {
  all = data.frame()
  for (aYear in year) {
    all = rbind(all, get_stats19(year = aYear,
                                 type = type,
                                 data_dir = data_dir,
                                 file_name = file_name,
                                 format = format,
                                 ask = ask,
                                 output_format = output_format, ...))
  }
  return(all)
}

The first problem is clear: I simply forgot to pass the silent parameter in that part of the get_stats19 function, sorry 😅. The other problems are harder and I'm not sure about the best approach. The easiest solution would be to check whether length(year) > 1 & output_format != "tibble" and, if so, stop the function with an error message. A more elegant approach would be to modify that for loop according to the output_format specification. What do you think?
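To make the two options concrete, here is a minimal sketch of each. It is not the code in the package: check_multi_year(), combine_years() and fake_get() are hypothetical names used only for illustration, and the ppp branch assumes that spatstat.geom::superimpose() is a suitable way to merge point patterns (it is only called when ppp output is requested). Inside if(), && is used rather than & because the condition is a single logical value.

```r
# Option 1 (hypothetical helper): refuse unsupported combinations early.
check_multi_year <- function(year, output_format) {
  if (length(year) > 1 && output_format != "tibble") {
    stop("Multiple years are only supported with output_format = \"tibble\"",
         call. = FALSE)
  }
  invisible(TRUE)
}

# Option 2 (hypothetical helper): combine per-year results by output format.
combine_years <- function(results, output_format = "tibble") {
  if (output_format == "ppp") {
    # Assumption: superimpose() from spatstat.geom merges the point patterns.
    return(do.call(spatstat.geom::superimpose, results))
  }
  # Data frames, tibbles and sf objects can be combined with rbind();
  # binding the real results together avoids starting from an empty
  # data.frame(), which carries no spatial metadata.
  do.call(rbind, results)
}

# Stand-in for a single-year call to get_stats19() (toy data, not real records).
fake_get <- function(year) {
  data.frame(accident_index = paste0(year, "01"), year = year)
}

years <- c(2017, 2018)
results <- lapply(years, fake_get)
all_years <- combine_years(results, output_format = "data.frame")
nrow(all_years)
```

Collecting the per-year results in a list and combining them once also avoids growing an object inside the loop, and the same pattern would let the silent argument be forwarded in the single-year call.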

Actually, the solution for sf objects is trivial (and I've already pushed it to the multiple_output_format branch). I'm not sure about ppp objects, but I asked about that on SO.

Aha, yes, I didn't think about how #136 could be related to #99... Well spotted, great that you're on it. Many thanks!

Heads-up @layik and @wengraf this is now fixed, as shown in reprex below. Great work @agila5 !

remotes::install_github("ropensci/stats19")
#> Skipping install of 'stats19' from a github remote, the SHA1 (f9982ed8) has not changed since last install.
#>   Use `force = TRUE` to force installation
library(stats19)
#> Data provided under OGL v3.0. Cite the source and link to:
#> www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
a = get_stats19(2017:2018, "ac")
#> Files identified: dftRoadSafetyData_Accidents_2017.zip
#>    http://data.dft.gov.uk.s3.amazonaws.com/road-accidents-safety-data/dftRoadSafetyData_Accidents_2017.zip
#> Data already exists in data_dir, not downloading
#> Data saved at ~/stats19-data/dftRoadSafetyData_Accidents_2017/Acc.csv
#> Reading in:
#> /home/robin/stats19-data/dftRoadSafetyData_Accidents_2017/Acc.csv
#> date and time columns present, creating formatted datetime column
#> Files identified: dftRoadSafetyData_Accidents_2018.csv
#>    http://data.dft.gov.uk.s3.amazonaws.com/road-accidents-safety-data/dftRoadSafetyData_Accidents_2018.csv
#> Data already exists in data_dir, not downloading
#> Data saved at ~/stats19-data/dftRoadSafetyData_Accidents_2018.csv
#> Reading in:
#> ~/stats19-data/dftRoadSafetyData_Accidents_2018.csv
#> date and time columns present, creating formatted datetime column
adf = get_stats19(2017:2018, "ac", output_format = "data.frame")
#> Files identified: dftRoadSafetyData_Accidents_2017.zip
#>    http://data.dft.gov.uk.s3.amazonaws.com/road-accidents-safety-data/dftRoadSafetyData_Accidents_2017.zip
#> Data already exists in data_dir, not downloading
#> Data saved at ~/stats19-data/dftRoadSafetyData_Accidents_2017/Acc.csv
#> Reading in:
#> /home/robin/stats19-data/dftRoadSafetyData_Accidents_2017/Acc.csv
#> date and time columns present, creating formatted datetime column
#> Files identified: dftRoadSafetyData_Accidents_2018.csv
#>    http://data.dft.gov.uk.s3.amazonaws.com/road-accidents-safety-data/dftRoadSafetyData_Accidents_2018.csv
#> Data already exists in data_dir, not downloading
#> Data saved at ~/stats19-data/dftRoadSafetyData_Accidents_2018.csv
#> Reading in:
#> ~/stats19-data/dftRoadSafetyData_Accidents_2018.csv
#> date and time columns present, creating formatted datetime column
asf = get_stats19(2017:2018, "ac", output_format = "sf")
#> Files identified: dftRoadSafetyData_Accidents_2017.zip
#>    http://data.dft.gov.uk.s3.amazonaws.com/road-accidents-safety-data/dftRoadSafetyData_Accidents_2017.zip
#> Data already exists in data_dir, not downloading
#> Data saved at ~/stats19-data/dftRoadSafetyData_Accidents_2017/Acc.csv
#> Reading in:
#> /home/robin/stats19-data/dftRoadSafetyData_Accidents_2017/Acc.csv
#> date and time columns present, creating formatted datetime column
#> 19 rows removed with no coordinates
#> Files identified: dftRoadSafetyData_Accidents_2018.csv
#>    http://data.dft.gov.uk.s3.amazonaws.com/road-accidents-safety-data/dftRoadSafetyData_Accidents_2018.csv
#> Data already exists in data_dir, not downloading
#> Data saved at ~/stats19-data/dftRoadSafetyData_Accidents_2018.csv
#> Reading in:
#> ~/stats19-data/dftRoadSafetyData_Accidents_2018.csv
#> date and time columns present, creating formatted datetime column
#> 55 rows removed with no coordinates
appp = get_stats19(2017:2018, "ac", output_format = "ppp")
#> Files identified: dftRoadSafetyData_Accidents_2017.zip
#>    http://data.dft.gov.uk.s3.amazonaws.com/road-accidents-safety-data/dftRoadSafetyData_Accidents_2017.zip
#> Data already exists in data_dir, not downloading
#> Data saved at ~/stats19-data/dftRoadSafetyData_Accidents_2017/Acc.csv
#> Reading in:
#> /home/robin/stats19-data/dftRoadSafetyData_Accidents_2017/Acc.csv
#> date and time columns present, creating formatted datetime column
#> 19 rows removed with no coordinates
#> Warning: some mark values are NA in the point pattern x
#> Files identified: dftRoadSafetyData_Accidents_2018.csv
#>    http://data.dft.gov.uk.s3.amazonaws.com/road-accidents-safety-data/dftRoadSafetyData_Accidents_2018.csv
#> Data already exists in data_dir, not downloading
#> Data saved at ~/stats19-data/dftRoadSafetyData_Accidents_2018.csv
#> Reading in:
#> ~/stats19-data/dftRoadSafetyData_Accidents_2018.csv
#> date and time columns present, creating formatted datetime column
#> 55 rows removed with no coordinates
#> Warning: some mark values are NA in the point pattern x

#> Warning: some mark values are NA in the point pattern x

class(a)
#> [1] "spec_tbl_df" "tbl_df"      "tbl"         "data.frame"
class(adf)
#> [1] "data.frame"
class(asf)
#> [1] "sf"         "tbl_df"     "tbl"        "data.frame"
class(appp)
#> [1] "ppp"

head(a)
#> Registered S3 method overwritten by 'cli':
#>   method     from    
#>   print.boxx spatstat
#> # A tibble: 6 x 33
#>   accident_index location_eastin… location_northi… longitude latitude
#>   <chr>                     <int>            <int>     <dbl>    <dbl>
#> 1 2017010001708            532920           196330   -0.0801     51.7
#> 2 2017010009342            526790           181970   -0.174      51.5
#> 3 2017010009344            535200           181260   -0.0530     51.5
#> 4 2017010009348            534340           193560   -0.0607     51.6
#> 5 2017010009350            533680           187820   -0.0724     51.6
#> 6 2017010009351            514510           172370   -0.354      51.4
#> # … with 28 more variables: police_force <chr>, accident_severity <chr>,
#> #   number_of_vehicles <int>, number_of_casualties <int>, date <date>,
#> #   day_of_week <chr>, time <chr>, local_authority_district <chr>,
#> #   local_authority_highway <chr>, first_road_class <chr>,
#> #   first_road_number <int>, road_type <chr>, speed_limit <int>,
#> #   junction_detail <chr>, junction_control <chr>, second_road_class <chr>,
#> #   second_road_number <int>, pedestrian_crossing_human_control <chr>,
#> #   pedestrian_crossing_physical_facilities <chr>, light_conditions <chr>,
#> #   weather_conditions <chr>, road_surface_conditions <chr>,
#> #   special_conditions_at_site <chr>, carriageway_hazards <chr>,
#> #   urban_or_rural_area <chr>,
#> #   did_police_officer_attend_scene_of_accident <int>,
#> #   lsoa_of_accident_location <chr>, datetime <dttm>
head(adf)
#>   accident_index location_easting_osgr location_northing_osgr longitude
#> 1  2017010001708                532920                 196330 -0.080107
#> 2  2017010009342                526790                 181970 -0.173845
#> 3  2017010009344                535200                 181260 -0.052969
#> 4  2017010009348                534340                 193560 -0.060658
#> 5  2017010009350                533680                 187820 -0.072372
#> 6  2017010009351                514510                 172370 -0.353876
#>   latitude        police_force accident_severity number_of_vehicles
#> 1 51.65006 Metropolitan Police             Fatal                  2
#> 2 51.52242 Metropolitan Police            Slight                  2
#> 3 51.51410 Metropolitan Police            Slight                  3
#> 4 51.62483 Metropolitan Police            Slight                  2
#> 5 51.57341 Metropolitan Police           Serious                  1
#> 6 51.43876 Metropolitan Police            Slight                  2
#>   number_of_casualties       date day_of_week  time local_authority_district
#> 1                    3 2017-08-05    Saturday 03:12                  Enfield
#> 2                    1 2017-01-01      Sunday 01:30              Westminster
#> 3                    1 2017-01-01      Sunday 00:30            Tower Hamlets
#> 4                    1 2017-01-01      Sunday 01:11                  Enfield
#> 5                    1 2017-01-01      Sunday 01:42                  Hackney
#> 6                    1 2017-01-01      Sunday 03:31     Richmond upon Thames
#>   local_authority_highway first_road_class first_road_number          road_type
#> 1                 Enfield                A               105 Single carriageway
#> 2             Westminster                A                 5 Single carriageway
#> 3           Tower Hamlets                A                13 Single carriageway
#> 4                 Enfield                A              1010         Roundabout
#> 5                 Hackney                A               107   Dual carriageway
#> 6    Richmond upon Thames     Unclassified                 0 Single carriageway
#>   speed_limit                     junction_detail             junction_control
#> 1          30 Not at junction or within 20 metres Data missing or out of range
#> 2          30             T or staggered junction     Give way or uncontrolled
#> 3          30             T or staggered junction     Give way or uncontrolled
#> 4          30                          Roundabout     Give way or uncontrolled
#> 5          20                          Crossroads          Auto traffic signal
#> 6          30 Not at junction or within 20 metres Data missing or out of range
#>   second_road_class second_road_number pedestrian_crossing_human_control
#> 1              <NA>                  0             None within 50 metres
#> 2      Unclassified                  0             None within 50 metres
#> 3                 C                  0             None within 50 metres
#> 4                 B                154             None within 50 metres
#> 5                 A                 10             None within 50 metres
#> 6              <NA>                  0             None within 50 metres
#>                                     pedestrian_crossing_physical_facilities
#> 1                          No physical crossing facilities within 50 metres
#> 2                          No physical crossing facilities within 50 metres
#> 3                          No physical crossing facilities within 50 metres
#> 4 Pelican, puffin, toucan or similar non-junction pedestrian light crossing
#> 5                               Pedestrian phase at traffic signal junction
#> 6                          No physical crossing facilities within 50 metres
#>        light_conditions    weather_conditions road_surface_conditions
#> 1 Darkness - lights lit    Fine no high winds                     Dry
#> 2 Darkness - lights lit    Fine no high winds             Wet or damp
#> 3 Darkness - lights lit    Fine no high winds                     Dry
#> 4 Darkness - lights lit Raining no high winds             Wet or damp
#> 5 Darkness - lights lit    Fine no high winds             Wet or damp
#> 6 Darkness - lights lit    Fine no high winds             Wet or damp
#>   special_conditions_at_site carriageway_hazards urban_or_rural_area
#> 1                       None                None               Urban
#> 2                       None                None               Urban
#> 3                       None                None               Urban
#> 4                       None                None               Urban
#> 5                       None                None               Urban
#> 6                       None                None               Urban
#>   did_police_officer_attend_scene_of_accident lsoa_of_accident_location
#> 1                                           1                 E01001450
#> 2                                           1                 E01004702
#> 3                                           1                 E01004298
#> 4                                           1                 E01001429
#> 5                                           1                 E01001808
#> 6                                           1                 E01003900
#>              datetime
#> 1 2017-08-05 03:12:00
#> 2 2017-01-01 01:30:00
#> 3 2017-01-01 00:30:00
#> 4 2017-01-01 01:11:00
#> 5 2017-01-01 01:42:00
#> 6 2017-01-01 03:31:00
head(asf)
#> Simple feature collection with 6 features and 31 fields
#> geometry type:  POINT
#> dimension:      XY
#> bbox:           xmin: 514510 ymin: 172370 xmax: 535200 ymax: 196330
#> epsg (SRID):    27700
#> proj4string:    +proj=tmerc +lat_0=49 +lon_0=-2 +k=0.9996012717 +x_0=400000 +y_0=-100000 +ellps=airy +units=m +no_defs
#> # A tibble: 6 x 32
#>   accident_index longitude latitude police_force accident_severi…
#>   <chr>              <dbl>    <dbl> <chr>        <chr>           
#> 1 2017010001708    -0.0801     51.7 Metropolita… Fatal           
#> 2 2017010009342    -0.174      51.5 Metropolita… Slight          
#> 3 2017010009344    -0.0530     51.5 Metropolita… Slight          
#> 4 2017010009348    -0.0607     51.6 Metropolita… Slight          
#> 5 2017010009350    -0.0724     51.6 Metropolita… Serious         
#> 6 2017010009351    -0.354      51.4 Metropolita… Slight          
#> # … with 27 more variables: number_of_vehicles <int>,
#> #   number_of_casualties <int>, date <date>, day_of_week <chr>, time <chr>,
#> #   local_authority_district <chr>, local_authority_highway <chr>,
#> #   first_road_class <chr>, first_road_number <int>, road_type <chr>,
#> #   speed_limit <int>, junction_detail <chr>, junction_control <chr>,
#> #   second_road_class <chr>, second_road_number <int>,
#> #   pedestrian_crossing_human_control <chr>,
#> #   pedestrian_crossing_physical_facilities <chr>, light_conditions <chr>,
#> #   weather_conditions <chr>, road_surface_conditions <chr>,
#> #   special_conditions_at_site <chr>, carriageway_hazards <chr>,
#> #   urban_or_rural_area <chr>,
#> #   did_police_officer_attend_scene_of_accident <int>,
#> #   lsoa_of_accident_location <chr>, datetime <dttm>, geometry <POINT [m]>
head(appp)
#> Warning: some mark values are NA in the point pattern x
#> Marked planar point pattern: 6 points
#> Mark variables: 
#>    accident_index longitude latitude police_force accident_severity 
#> number_of_vehicles number_of_casualties date day_of_week time 
#> local_authority_district local_authority_highway first_road_class 
#> first_road_number road_type speed_limit junction_detail junction_control 
#> second_road_class second_road_number pedestrian_crossing_human_control 
#> pedestrian_crossing_physical_facilities light_conditions weather_conditions 
#> road_surface_conditions special_conditions_at_site carriageway_hazards 
#> urban_or_rural_area did_police_officer_attend_scene_of_accident 
#> lsoa_of_accident_location datetime
#> window: rectangle = [64950, 655391] x [10235, 1209512] units

Created on 2020-02-03 by the reprex package (v0.3.0)