Dataset candidate: Dutch OD data

Question

Opened this issue 4 years ago · 3 comments

See https://github.com/r-tmap/tmap-data/blob/master/Rmd/Dutch_commuting.Rmd

Let me know what you think.

I'm not convinced myself. If we were to use OD data, UK data might be a better choice, since there is far less cross-border commuting traffic and we can connect these data to other open data, such as stats19. @Robinlovelace can surely assist us with that :-) On the other hand, a smaller country like Ireland would also have its benefits.

Answer 1 · 2020-09-09T08:15:04.000Z

Yes, there is good open data for the UK that can be downloaded and processed as follows (using the WIP od package, comments welcome):

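# remotes::install_github("itsleeds/od")
# remotes::install_cran("pct")
library(pct)
library(od)
# Zone centroids and national OD (commuting) data
centroids = pct::get_centroids_ew()
od_data = pct::get_od()
# Convert OD pairs to desire lines (sf object) between zone centroids
desire_lines = od::od_to_sf(x = od_data, z = centroids)
nrow(desire_lines)
names(desire_lines)
# Keep only desire lines originating in one local authority
local_authority = "Leeds"
desire_lines_local = desire_lines[desire_lines$la_1 == local_authority, ]
# Line width proportional to the square root of total commuters
lwd = sqrt(desire_lines_local$all) / 10
plot(desire_lines_local$geometry, lwd = lwd)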

Resulting in...

# remotes::install_github("itsleeds/od")
# remotes::install_cran("pct")
library(pct)
library(od)
centroids = pct::get_centroids_ew()
#> Parsed with column specification:
#> cols(
#>   MSOA11CD = col_character(),
#>   MSOA11NM = col_character(),
#>   BNGEAST = col_double(),
#>   BNGNORTH = col_double(),
#>   LONGITUDE = col_double(),
#>   LATITUDE = col_double()
#> )
od_data = pct::get_od()
#> No region provided. Returning national OD data.
#> Parsed with column specification:
#> cols(
#>   `Area of residence` = col_character(),
#>   `Area of workplace` = col_character(),
#>   `All categories: Method of travel to work` = col_double(),
#>   `Work mainly at or from home` = col_double(),
#>   `Underground, metro, light rail, tram` = col_double(),
#>   Train = col_double(),
#>   `Bus, minibus or coach` = col_double(),
#>   Taxi = col_double(),
#>   `Motorcycle, scooter or moped` = col_double(),
#>   `Driving a car or van` = col_double(),
#>   `Passenger in a car or van` = col_double(),
#>   Bicycle = col_double(),
#>   `On foot` = col_double(),
#>   `Other method of travel to work` = col_double()
#> )
#> Parsed with column specification:
#> cols(
#>   MSOA11CD = col_character(),
#>   MSOA11NM = col_character(),
#>   BNGEAST = col_double(),
#>   BNGNORTH = col_double(),
#>   LONGITUDE = col_double(),
#>   LATITUDE = col_double()
#> )
desire_lines = od::od_to_sf(x = od_data, z = centroids)
#> 0 origins with no match in zone ids
#> 36312 destinations with no match in zone ids
#>  points not in od data removed.
nrow(desire_lines)
#> [1] 2365889
names(desire_lines)
#>  [1] "geo_code1"     "geo_code2"     "all"           "from_home"    
#>  [5] "light_rail"    "train"         "bus"           "taxi"         
#>  [9] "motorbike"     "car_driver"    "car_passenger" "bicycle"      
#> [13] "foot"          "other"         "geo_name1"     "geo_name2"    
#> [17] "la_1"          "la_2"          "geometry"
local_authority = "Leeds"
desire_lines_local = desire_lines[desire_lines$la_1 == local_authority, ]
lwd = sqrt(desire_lines_local$all) / 10
plot(desire_lines_local$geometry, lwd = lwd)

Created on 2020-09-09 by the reprex package (v0.3.0)
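The "no match in zone ids" messages in the output above come from joining OD pairs to zone centroids. As a rough illustration of that idea only (not the od package's actual implementation), here is a minimal base R sketch on toy data. The zone codes, coordinates, and object names (`centroids_toy`, `od_toy`, `lines_toy`) are all made up for the example.

```r
# Toy zone centroids (hypothetical codes and coordinates, not real MSOA data)
centroids_toy = data.frame(
  zone = c("A", "B", "C"),
  x = c(0, 1, 2),
  y = c(0, 1, 0)
)

# Toy OD table; destination "Z" has no centroid
od_toy = data.frame(
  origin = c("A", "A", "B", "C"),
  destination = c("B", "C", "Z", "A"),
  all = c(100, 25, 40, 9)
)

# Count and drop pairs whose origin or destination has no matching zone
origin_ok = od_toy$origin %in% centroids_toy$zone
destination_ok = od_toy$destination %in% centroids_toy$zone
message(sum(!origin_ok), " origins with no match in zone ids")
message(sum(!destination_ok), " destinations with no match in zone ids")
od_matched = od_toy[origin_ok & destination_ok, ]

# Look up start and end coordinates for each remaining pair
i_o = match(od_matched$origin, centroids_toy$zone)
i_d = match(od_matched$destination, centroids_toy$zone)
lines_toy = data.frame(
  od_matched,
  x1 = centroids_toy$x[i_o], y1 = centroids_toy$y[i_o],
  x2 = centroids_toy$x[i_d], y2 = centroids_toy$y[i_d]
)

# Line width scaled as in the example above: square root of flow, divided by 10
lines_toy$lwd = sqrt(lines_toy$all) / 10
```

The resulting start and end coordinates could then be drawn with base R's `segments()`, passing `lwd = lines_toy$lwd`.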

Answer 2 · 2020-09-11T10:42:47.000Z

@mtennekes my overall thinking is that when somebody is not convinced themselves, it is good to look for alternatives ;)

In this case, I checked your code, and I like the Dutch data: we can, for example, focus on one region while showing OD data.

On the other hand, connecting the OD data with stats19 could be interesting; however, I have zero experience with that...
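Focusing on a region, as suggested above, could look something like the following minimal sketch on toy data. The `la_1` and `la_2` column names mirror the UK output earlier in the thread. The values and the object names (`flows_toy`, `from_region`, `within_region`) are made up, and whether the real data contain missing local authority names is an assumption.

```r
# Toy desire-line attributes (hypothetical values; la_1 and la_2 mirror the
# column names in the output earlier in the thread)
flows_toy = data.frame(
  la_1 = c("Leeds", "Leeds", "Bradford", NA),
  la_2 = c("Leeds", "Bradford", "Leeds", "Leeds"),
  all = c(120, 45, 30, 10)
)
region = "Leeds"

# which() drops NA comparisons instead of returning NA rows
from_region = flows_toy[which(flows_toy$la_1 == region), ]

# Flows that both start and end in the region
within_region = flows_toy[
  which(flows_toy$la_1 == region & flows_toy$la_2 == region),
]
```

Wrapping the condition in `which()` is a defensive choice: a plain logical subset would return all-NA rows wherever the region name is missing.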

Answer 3 · 2020-09-11T15:18:34.000Z

> On the other hand, connecting the OD data with stats19 could be interesting; however, I have zero experience with that...

Andrea Gilardi (@agila5) is doing a PhD on that topic. He has lots of reproducible code that I can point you towards if that would be useful...