ipeaGIT/gtfstools

improving flexibility of filter_by_route_id

Joaobazzo opened this issue · 1 comments

Hi Daniel,
I tried to use the filter_by_route_id function on the São Paulo GTFS feed to keep only bus trips, but the function seems to be very strict about the quality of the GTFS data. You can reproduce this example on the IPEA server.

Instead of using filter_by_route_id, my workaround was to select the shape_ids related to bus trips, which worked for me.

library(data.table)
spo_gtfs <- gtfstools::read_gtfs("L://Proj_acess_oport//data-raw//gtfs//spo//2019//gtfs_spo_sptrans_2019-06.zip")

format( object.size(spo_gtfs), "Mb")
# [1] "48.3 Mb"

# using standard solution
spo_gtfs1 <- gtfstools::filter_by_route_id(spo_gtfs,route_id = "3")

format( object.size(spo_gtfs1), "Mb")
# [1] "0 Mb"

# filter by shape_id instead
temp_routeid <- spo_gtfs$routes[route_type == 3,route_id]

temp_shapeids <- spo_gtfs$trips[route_id %in% unique(temp_routeid),shape_id]

spo_gtfs2 <- gtfstools::filter_by_shape_id(spo_gtfs,
                                           shape_id = unique(temp_shapeids))

format( object.size(spo_gtfs2), "Mb")
# [1] "48 Mb"

I'm wondering whether we could make filter_by_route_id less restrictive, given that GTFS feeds often contain errors or missing information in the calendar, calendar_dates, or agency files. Let me know your thoughts on it. Thanks!

I think I actually used the wrong function: filter_by_route_id instead of filter_by_route_type. No worries, the function is working fine, haha.
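
For anyone landing here with the same mix-up, a minimal sketch of what the call was probably meant to be. It assumes the small São Paulo sample feed bundled with gtfstools as a stand-in for the SPTrans file above, and that filter_by_route_type takes the numeric GTFS route type, where 3 means bus:

```r
library(gtfstools)

# Assumption: the sample feed shipped with gtfstools replaces the full
# SPTrans feed from the issue, which lives on an internal server.
data_path <- system.file("extdata/spo_gtfs.zip", package = "gtfstools")
gtfs <- read_gtfs(data_path)

# route_type 3 is "bus" in the GTFS specification. route_id, in contrast, is
# an agency-assigned identifier, so route_id = "3" only matches a route whose
# id is literally "3".
bus_gtfs <- filter_by_route_type(gtfs, route_type = 3)

# Inspect which route types remain in the filtered feed.
unique(bus_gtfs$routes$route_type)
```

Filtering by route_id = "3" most likely returned an empty feed because no route in the São Paulo data has that id, not because the function is strict about data quality.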