Create filter by day of the week
rafapereirabr opened this issue · 4 comments
The gtfs2gps package already has a filter_week_days() function that we could use as a base.
I'll probably do something like filter_by_weekday(gtfs, weekday, keep), where weekday can be any of c("mon", "tue", "wed", "thu", "fri", "sat", "sun").
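As a rough illustration of the proposed weekday argument, the input could be checked against the allowed abbreviations before any filtering happens. This is only a sketch: validate_weekday() is a hypothetical helper name, not a function from gtfstools or gtfs2gps.

```r
# Hypothetical helper (not part of gtfstools or gtfs2gps): checks that every
# requested day is one of the abbreviations proposed above.
validate_weekday <- function(weekday) {
  valid <- c("mon", "tue", "wed", "thu", "fri", "sat", "sun")
  bad <- setdiff(weekday, valid)
  if (length(bad) > 0) {
    stop("invalid weekday: ", paste(bad, collapse = ", "))
  }
  weekday
}

validate_weekday(c("mon", "sun"))
```

An explicit setdiff() check is used here rather than match.arg(several.ok = TRUE), which silently drops unmatched entries as long as at least one entry matches.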
Looking at the function already implemented in the gtfs2gps package, the new function should also handle GTFS feeds that contain only the calendar_dates file (without calendar).
filter_by_weekday() has been introduced in 2cab7b0. It includes a combine argument to control whether multiple days of the week are combined with OR or AND when filtering. From the function examples:
# read gtfs
library(gtfstools)
data_path <- system.file("extdata/spo_gtfs.zip", package = "gtfstools")
gtfs <- read_gtfs(data_path)
object.size(gtfs)
#> 811304 bytes
# keeps entries related to services that run EITHER on monday OR on sunday
smaller_gtfs <- filter_by_weekday(gtfs, weekday = c("monday", "sunday"))
smaller_gtfs$calendar[, c("service_id", "monday", "sunday")]
#> service_id monday sunday
#> 1: USD 1 1
#> 2: U__ 1 0
#> 3: US_ 1 0
#> 4: _SD 0 1
#> 5: __D 0 1
#> 6: USD 1 1
#> 7: U__ 1 0
#> 8: US_ 1 0
#> 9: _SD 0 1
#> 10: __D 0 1
object.size(smaller_gtfs)
#> 811248 bytes
# keeps entries related to services that run on monday AND on sunday
smaller_gtfs <- filter_by_weekday(
  gtfs,
  weekday = c("monday", "sunday"),
  combine = "and"
)
smaller_gtfs$calendar[, c("service_id", "monday", "sunday")]
#> service_id monday sunday
#> 1: USD 1 1
#> 2: USD 1 1
object.size(smaller_gtfs)
#> 762152 bytes
# drops entries related to services that run EITHER on monday OR on sunday
# the resulting gtfs shouldn't include any trips running on these days
smaller_gtfs <- filter_by_weekday(
  gtfs,
  weekday = c("monday", "sunday"),
  keep = FALSE
)
smaller_gtfs$calendar[, c("service_id", "monday", "sunday")]
#> service_id monday sunday
#> 1: _S_ 0 0
#> 2: _S_ 0 0
object.size(smaller_gtfs)
#> 19912 bytes
# drops entries related to services that run on monday AND on sunday
# the resulting gtfs may include trips that run on these days, but no trips
# that run on both these days
smaller_gtfs <- filter_by_weekday(
  gtfs,
  weekday = c("monday", "sunday"),
  combine = "and",
  keep = FALSE
)
smaller_gtfs$calendar[, c("service_id", "monday", "sunday")]
#> service_id monday sunday
#> 1: U__ 1 0
#> 2: US_ 1 0
#> 3: _SD 0 1
#> 4: __D 0 1
#> 5: _S_ 0 0
#> 6: U__ 1 0
#> 7: US_ 1 0
#> 8: _SD 0 1
#> 9: __D 0 1
#> 10: _S_ 0 0
object.size(smaller_gtfs)
#> 69880 bytes
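To make the combine/keep semantics from the examples above easier to follow, here is a minimal base R sketch on a toy calendar table. The table values and the select_services() helper are made up for illustration. This is not how filter_by_weekday() is implemented, only one way the OR/AND logic could be expressed.

```r
# Toy calendar table; service ids and values are invented for illustration.
calendar <- data.frame(
  service_id = c("USD", "U__", "_SD", "_S_"),
  monday     = c(1, 1, 0, 0),
  sunday     = c(1, 0, 1, 0),
  stringsAsFactors = FALSE
)

# Hypothetical helper, not the gtfstools implementation.
select_services <- function(calendar, weekday, combine = "or", keep = TRUE) {
  # logical matrix: one row per service, one column per requested day
  runs <- calendar[, weekday, drop = FALSE] == 1
  # "or": runs on at least one of the days; "and": runs on all of them
  matched <- if (combine == "or") {
    rowSums(runs) > 0
  } else {
    rowSums(runs) == length(weekday)
  }
  # keep = FALSE inverts the selection
  calendar$service_id[if (keep) matched else !matched]
}

select_services(calendar, c("monday", "sunday"))
select_services(calendar, c("monday", "sunday"), combine = "and")
select_services(calendar, c("monday", "sunday"), keep = FALSE)
select_services(calendar, c("monday", "sunday"), combine = "and", keep = FALSE)
```

On this toy table, the four calls select the same kinds of services as the four examples above: services running on either day, on both days, on neither day, and on at most one of the two days.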
For now it only uses the calendar table to filter, as using the calendar_dates table adds a lot of complexity to the function. I'm closing this issue for now, but if this becomes a problem in the future we can tackle it later.
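For context on where that complexity comes from: calendar_dates lists individual dates rather than weekday flags, so the day of the week has to be derived from each date. The sketch below uses base R on a toy table and assumes the date column is already a Date object. The table values and the services_on() helper are hypothetical, and rows with exception_type 2 (service removed on that date, per the GTFS spec) are simply ignored here, which a real implementation would have to reconcile with calendar.

```r
# Toy calendar_dates table; values are invented for illustration.
calendar_dates <- data.frame(
  service_id     = c("A", "A", "B"),
  date           = as.Date(c("2021-03-01", "2021-03-07", "2021-03-02")),
  exception_type = c(1, 1, 1),
  stringsAsFactors = FALSE
)

# as.POSIXlt()$wday is 0 (Sunday) to 6 (Saturday), independent of locale
day_names <- c("sunday", "monday", "tuesday", "wednesday",
               "thursday", "friday", "saturday")

# keep only dates on which service is added (exception_type 1)
added <- calendar_dates[calendar_dates$exception_type == 1, ]
added$weekday <- day_names[as.POSIXlt(added$date)$wday + 1]

# Hypothetical helper: services with at least one added date on the given days
services_on <- function(weekday) {
  unique(added$service_id[added$weekday %in% weekday])
}

services_on(c("monday", "sunday"))
```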
Brilliant!