improving flexibility of filter_by_route_id
Joaobazzo opened this issue · 1 comments
Hi Daniel,
I tried to use the filter_by_route_id function on the São Paulo GTFS to filter for bus trips, but the function seems to be very strict about the quality of the GTFS. You can reproduce this example on the IPEA server.
Instead of using filter_by_route_id, my workaround was to select the shape_ids related to bus trips, which worked for me.
library(data.table)
spo_gtfs <- gtfstools::read_gtfs("L://Proj_acess_oport//data-raw//gtfs//spo//2019//gtfs_spo_sptrans_2019-06.zip")
format( object.size(spo_gtfs), "Mb")
# [1] "48.3 Mb"
# using standard solution
spo_gtfs1 <- gtfstools::filter_by_route_id(spo_gtfs,route_id = "3")
format( object.size(spo_gtfs1), "Mb")
# [1] "0 Mb"
# filter by shape ids: take the route_ids of bus routes (route_type == 3),
# then the shape_ids of the trips that belong to those routes
temp_routeid <- spo_gtfs$routes[route_type == 3, route_id]
temp_shapeids <- spo_gtfs$trips[route_id %in% unique(temp_routeid), shape_id]
spo_gtfs2 <- gtfstools::filter_by_shape_id(spo_gtfs,
                                           shape_id = unique(temp_shapeids))
format( object.size(spo_gtfs2), "Mb")
# [1] "48 Mb"
I'm wondering whether we could adjust filter_by_route_id to be less restrictive, given that GTFS feeds often have errors or missing information in the calendar, calendar_dates, or agency files. Let me know your thoughts on it. Thanks!
I think I actually used the wrong function: filter_by_route_id instead of filter_by_route_type. No worries, the function is working fine haha
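For anyone who lands here with the same empty result, here is a minimal sketch of the mix-up in base R. The routes table is made up for illustration (the ids are hypothetical, not taken from the SPTrans feed); the point is only that "3" is a route_type value (bus), not a route_id, so matching on route_id finds nothing.

```r
# Toy routes table with hypothetical values, standing in for spo_gtfs$routes
routes <- data.frame(
  route_id   = c("1012-10", "8000-10", "METRO-L1"),
  route_type = c(3L, 3L, 1L),
  stringsAsFactors = FALSE
)

# Filtering by route_id = "3" looks for a route whose id is literally "3"
by_id <- routes[routes$route_id %in% "3", ]
nrow(by_id)    # 0

# What was intended: keep routes whose route_type is 3 (bus)
by_type <- routes[routes$route_type %in% 3L, ]
nrow(by_type)  # 2

# On the real feed, the corresponding gtfstools call would be:
# spo_bus <- gtfstools::filter_by_route_type(spo_gtfs, route_type = 3)
```

So the "0 Mb" object in the original report is what you would expect when no route_id matches, rather than a sign that the function is too strict about feed quality.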