ipeaGIT/r5r

Isochrones in low-road density contexts

Closed this issue · 7 comments

Hello, I was getting the same error mentioned in #346 specifically:

`Error in `dplyr::summarise()`:
ℹ In argument: `polygons = `%>%`(...)`.
Caused by error:
! TypeError: Cannot read properties of undefined (reading '0')`

More generally, I discovered that this error appears whenever the nearest network vertex is not reachable within the specified travel time and mode, which can be quite common in rural areas: for example, when the travel time is short (e.g. 5 minutes) and you're only estimating walking. I thought I would flag this as a "bug" (although it's more a feature of how the isochrones are calculated), and perhaps it could be documented somewhere. Also, if there are many start points (some of which do have multiple vertices within reach), the error prevents isochrones from being generated for any of them, and it's not clear from the error which start point "broke" the function.
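As a possible workaround for the many-start-points case, each origin can be processed on its own so that one failure does not block the rest. The following is only a minimal sketch in base R: `run_per_origin()` is a hypothetical helper (not part of r5r), and it assumes the origins are a data frame with an `id` column, as in the r5r sample data.

```r
# Hypothetical helper (not part of r5r): apply `fun` to each origin row
# separately and record which origins raised an error.
run_per_origin <- function(origins, fun) {
  out <- lapply(seq_len(nrow(origins)), function(i) {
    tryCatch(
      list(ok = TRUE, value = fun(origins[i, , drop = FALSE])),
      error = function(e) list(ok = FALSE, value = conditionMessage(e))
    )
  })
  ok <- vapply(out, function(x) x$ok, logical(1))
  list(
    # results for the origins that succeeded
    results = lapply(out[ok], function(x) x$value),
    # ids and error messages for the origins that failed
    failed = data.frame(
      id = as.character(origins$id[!ok]),
      message = vapply(out[!ok], function(x) x$value, character(1)),
      stringsAsFactors = FALSE
    )
  )
}

# Toy demonstration with a stand-in function that fails for one origin.
toy <- data.frame(id = c("a", "b", "c"), lat = c(1, 2, 3), lon = c(4, 5, 6))
res <- run_per_origin(toy, function(o) {
  if (o$id == "b") stop("too few vertices")
  o$lat + o$lon
})
res$failed
```

With r5r, `fun` could wrap the real call, e.g. `function(o) r5r::isochrone(r5r_core, origins = o, mode = "walk", cutoffs = 5, departure_datetime = departure_datetime)`, and `res$failed` would then list the start points that broke.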

It may be outside the scope of the package, but could there be an option to calculate simple (but probably more computationally expensive) line isochrones that follow the road network rather than providing a polygon? These could perhaps be limited to low road/footpath density contexts and to distances of less than 1 km. Another option might be to do some additional sampling along the polylines wherever vertex density falls below a certain threshold, in order to calculate a bigger distance matrix.

Low-road-density contexts also generally produce more distortions, with isochrones sometimes covering only a small fraction of the only accessible road. Low road density can be quite common in more isolated rural areas, where nodes along country roads may be much farther than a 5- or 10-minute walk away.

Below is an example area.

Thanks for your efforts!

library(sf)
# example start point and surrounding rural area (WGS84)
house <- st_sfc(st_point(c(-2.24415, 57.11189)), crs = 4326)
ruralarea <- st_as_sfc(st_bbox(c(xmin = -2.249966, xmax = -2.239323, ymin = 57.109682, ymax = 57.114902), crs = st_crs(4326)))

Hi @richardneilbelcher. Thanks for opening this issue. I really like the idea of allowing users to choose whether the function should return polygon-based isochrones or line-based isochrones.

A thing to consider here, though, is that the current implementation of the function draws isochrone polygons based on the travel times from each origin to all (or a sample of) vertices in the street network. This works fine if we simply want to draw the isochrone polygons. If we want to do line-based isochrones, though, we should probably consider the travel times from each origin to all (or a sample) of the mid-points of the edges in the street network.

Hi @richardneilbelcher. I have now added a new boolean parameter polygon_output to the isochrone() function, so now the user can choose whether the output should be a polygon- or line-based isochrone. Here is the documentation of the new parameter:

#' @param polygon_output A Logical. If `TRUE`, the function outputs
#'        polygon-based isochrones (the default) based on travel times from each
#'        origin to a sample of a random  sample nodes in the transport network
#'        (see parameter `sample_size`). If `FALSE`, the function outputs
#'        line-based isocrhones based on travel times from each origin to the
#'        centroids of all segments in the transport network.

reprex

options(java.parameters = "-Xmx2G")
library(r5r)
library(ggplot2)

# build transport network
data_path <- system.file("extdata/poa", package = "r5r")
r5r_core <- setup_r5(data_path = data_path)

# load origin/point of interest
points <- read.csv(file.path(data_path, "poa_hexgrid.csv"))
origin_1 <- points[936,]

departure_datetime <- as.POSIXct(
 "13-05-2019 14:00:00",
 format = "%d-%m-%Y %H:%M:%S"
)

# estimate polygon-based isochrone from origin_1
iso_poly <- isochrone(r5r_core,
                 origins = origin_1,
                 mode = "walk",
                 polygon_output = T,
                 departure_datetime = departure_datetime,
                 cutoffs = seq(0, 100, 10)
                 )
head(iso_poly)


# estimate line-based isochrone from origin_1
iso_lines <- isochrone(r5r_core,
                      origins = origin_1,
                      mode = "walk",
                      polygon_output = F,
                      departure_datetime = departure_datetime,
                      cutoffs = seq(0, 100, 10)
)
head(iso_lines)


#### plot
colors <- c('#ffe0a5','#ffcb69','#ffa600','#ff7c43','#f95d6a',
            '#d45087','#a05195','#665191','#2f4b7c','#003f5c')

# polygons
ggplot() +
  geom_sf(data=iso_poly, aes(fill=factor(isochrone))) +
  scale_fill_manual(values = colors) +
  theme_minimal()

# lines
ggplot() +
  geom_sf(data=iso_lines, aes(color=factor(isochrone))) +
  scale_color_manual(values = colors) +
  theme_minimal()

I've also added an error message for cases where there are too few points (fewer than 4) to create a proper isochrone polygon.
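For illustration, a check of this kind could look roughly like the sketch below. It is not the code from the commit: `check_polygon_feasible()` and its arguments are hypothetical, and it assumes the travel times from one origin to the sampled vertices are available as a numeric vector in minutes, with `NA` for unreachable vertices.

```r
# Hypothetical guard (not the actual r5r implementation): stop early when
# too few vertices are reachable to build a polygon for a given cutoff.
check_polygon_feasible <- function(travel_times, cutoff, origin_id,
                                   min_points = 4) {
  # number of sampled vertices reachable within the cutoff
  n_reached <- sum(travel_times <= cutoff, na.rm = TRUE)
  if (n_reached < min_points) {
    stop(sprintf(
      "Origin '%s': only %d vertices reachable within %s minutes; consider polygon_output = FALSE.",
      origin_id, n_reached, format(cutoff)
    ), call. = FALSE)
  }
  invisible(n_reached)
}

# Enough reachable vertices: returns the count invisibly.
n_ok <- check_polygon_feasible(c(1, 2, 3, 4, 9, NA), cutoff = 5, origin_id = "x")

# Too few reachable vertices: the error names the origin.
msg <- tryCatch(
  check_polygon_feasible(c(1, 7, NA), cutoff = 5, origin_id = "rural_1"),
  error = function(e) conditionMessage(e)
)
```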

reprex

to reproduce the error:

# update to test error
origin_1$lat <- -30.10
origin_1$lon <- -51.18


# estimate polygon-based isochrone from origin_1
iso_poly <- isochrone(r5r_core,
                 origins = origin_1,
                 mode = "walk",
                 polygon_output = T,
                 departure_datetime = departure_datetime,
                 cutoffs = seq(0, 5, 1)
                 )

Error in FUN(X[[i]], ...) :
Your origin point is probably located in an area where the road density is too low to create proper isochrone polygons. In this case, we strongly recommend setting polygon_output = FALSE

it works with polygon_output = F:

# estimate line-based isochrone from origin_1
iso_lines <- isochrone(r5r_core,
                      origins = origin_1,
                      mode = "walk",
                      polygon_output = F,
                      departure_datetime = departure_datetime,
                      cutoffs = seq(0, 5, 1)
                      )
ggplot() +
  geom_point(data=origin_1, aes(x=lon, y=lat), color='orange') +
  geom_sf(data=iso_lines, aes(color=factor(isochrone))) +
  theme_minimal()

@richardneilbelcher, I think this should be solved with commit 808db5b. You can test with the reprex above or with your own data using the dev version of r5r. I'm happy to reopen the issue if the problem persists.

Hi @rafapereirabr this is great! Thanks for doing it so quickly too. All the changes you have made seem to work as intended on my system. A few ideas:

I like the idea of using the centroid of the polyline (so as not to double up on the nodes). I think this works in most cases, but in some (again, e.g. rural areas) the road lines are actually quite long, so even though the line centroid is accessible, much of the line may not be (e.g. a few single roads > 1 km). Would it also be possible to have an option for generating additional, user-defined, regularly spaced points along line segments, e.g. using st_sample()? This would probably work best in travel-time units to keep it consistent with the cutoff parameter, so e.g. a point every 0.5 minutes along line segments. There could potentially be an IF statement to exclude line segments shorter than this threshold too, if that helps make it run faster.
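To make the spacing idea concrete, here is a rough base R sketch. Everything in it is an assumption for illustration: `sample_along_line()` is a hypothetical helper, the line is given as a two-column matrix of projected coordinates in metres, and the 3.6 km/h walking speed should be replaced by whatever speed is used for routing.

```r
# Hypothetical helper: place a point every `step_min` minutes of walking
# along a polyline given as a two-column matrix of projected coordinates
# (metres). Lines shorter than one step return zero points.
sample_along_line <- function(xy, step_min = 0.5, speed_kmh = 3.6) {
  step_m  <- speed_kmh * 1000 / 60 * step_min   # metres walked per step
  seg_len <- sqrt(rowSums(diff(xy)^2))          # length of each piece
  cum_len <- c(0, cumsum(seg_len))              # distance from line start
  total   <- cum_len[length(cum_len)]
  if (total < step_m) return(xy[0, , drop = FALSE])
  targets <- seq(step_m, total, by = step_m)    # distances to sample at
  # linear interpolation of x and y at the target distances
  cbind(
    x = approx(cum_len, xy[, 1], xout = targets)$y,
    y = approx(cum_len, xy[, 2], xout = targets)$y
  )
}

# An L-shaped 80 m line sampled every 0.5 minutes (30 m at 3.6 km/h).
bend <- rbind(c(0, 0), c(40, 0), c(40, 40))
pts  <- sample_along_line(bend)

# A 10 m line is shorter than one step, so it is skipped.
short <- sample_along_line(rbind(c(0, 0), c(10, 0)))
```

The sampled points would then be used as extra destinations when computing travel times, which is where the additional computational cost comes from.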

The error message is helpful. Could it also say which start point it breaks on? (I think I am right in thinking it tries to calculate all the isochrones before trying to return them.) This could help users troubleshoot and give a glimpse of the severity of the problem with the data they have.

Also, in the documentation for polygon_output there is a typo (isocrhones instead of isochrones).

Hi @richardneilbelcher. Thanks for your feedback. It's much appreciated! I've improved the error message, which now includes the origin point that throws the error.

Regarding the other point you raised, please note that the function considers not the centroid of each road, but the centroid of every road segment. The number of segments may vary depending on the OSM input, but really long roads tend to have multiple segments. You can see the entire network and the segments using r5r::street_network_to_sf().

network_edges <- r5r::street_network_to_sf(r5r_core)$edges
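One rough way to gauge how many segments are long enough for this to matter is to compare segment lengths with the distance walkable within the cutoff. The sketch below is only illustrative: `flag_long_segments()` is a hypothetical helper, lengths are assumed to be in metres, and the 3.6 km/h walking speed is an assumption that should match the routing settings.

```r
# Hypothetical helper: flag segments longer than the distance that can be
# walked within `cutoff_min` minutes at `speed_kmh`.
flag_long_segments <- function(length_m, cutoff_min, speed_kmh = 3.6) {
  reach_m <- speed_kmh * 1000 / 60 * cutoff_min  # walkable distance, metres
  length_m > reach_m
}

# Toy lengths in metres; with a real network they could come from
# as.numeric(sf::st_length(network_edges)).
is_long <- flag_long_segments(c(50, 250, 1200), cutoff_min = 5)
```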

Having said that, I'm not entirely comfortable with adding yet another parameter to sample road segments. We already have too many parameters, this is a very specific corner case, and it would make the function much slower, because it would sample every road in the entire region and calculate travel times to all nodes. I'm not sure I would have the bandwidth to implement the idea at this stage.

PS: thanks for catching the typos!

That makes sense. It could be something for the user to fix in the pbf file before uploading too. Thanks, Richard