remix/partridge

"read_trip_ids_by_day" - custom service day hours

adiwaz opened this issue · 9 comments

Feature request

Add an option to get trip IDs by day, based on custom "service day" hours (not necessarily midnight to midnight).

Description

"Service day" definition may vary between applications.
For example, it is more convenient to define a service day of 4AM to 4AM of the next date, in order to analyze GTFS for working days.
To answer the question "When is the earliest departure of route x on date y?", a departure of route x at 00:30 on date y may be irrelevant, because it belongs to the previous service day.

Suggestion

One implementation option is to filter trip_ids with something similar to the following method, copied from Paul Harrington's post on https://groups.google.com/forum/#!topic/transit-developers/ZkfnuNv1gho :
"When searching for next departures at a stop at a time between midnight and 3am I rolled back a day and added 24 hours to the hour so instead of running a query for departures at a stop starting from a date of 20180124 and a time of 01:00:00 I would use 20180123 and 25:00:00"

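The rollback described in the quote can be sketched as below. This is only an illustration: the function name `to_service_day_query` and the `cutoff_hour` parameter are hypothetical and are not part of partridge.

```python
import datetime


def to_service_day_query(moment, cutoff_hour=3):
    """Convert a wall-clock datetime into a (date, time) pair for a GTFS query.

    Hypothetical helper: times before `cutoff_hour` are rolled back one day
    and get 24 added to the hour, as described in the quoted post.
    """
    if moment.hour < cutoff_hour:
        service_date = moment.date() - datetime.timedelta(days=1)
        hour = moment.hour + 24
    else:
        service_date = moment.date()
        hour = moment.hour
    gtfs_time = f"{hour:02d}:{moment.minute:02d}:{moment.second:02d}"
    return service_date.strftime("%Y%m%d"), gtfs_time


# 2018-01-24 01:00:00 is queried as 20180123 at 25:00:00.
print(to_service_day_query(datetime.datetime(2018, 1, 24, 1, 0, 0)))
```
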
This could be exposed as a new "read_trip_ids_by_day" method, similar to "read_service_ids_by_date", that also takes "day_start" and "day_end" hours as inputs.
We could then filter the feed by these trip IDs.

import datetime
import partridge as ptg

path = 'path/to/sfmta-2017-08-22.zip'

trip_ids_by_day = ptg.read_trip_ids_by_day(path, day_start="03:00:00", day_end="02:59:59")

trip_ids = trip_ids_by_day[datetime.date(2017, 9, 25)]
# "trip_ids" should now contain the IDs of all trips that start
# between 2017-09-25 03:00:00 and 2017-09-26 02:59:59

feed = ptg.feed(path, view={
    'trips.txt': {
        'trip_id': trip_ids,
    },
})

Thanks!!

This is a neat idea. I will think about this some more over the weekend. Should be able to use an adjacency matrix of dates to properly distribute trips.

One thought to tighten up the interface is to collapse the start and end times into one cutoff which splits the day into two parts.
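
A minimal sketch of the single-cutoff idea, assuming a trip is identified by its GTFS service date plus the time of its first departure (GTFS times may exceed 24:00:00). The names `parse_gtfs_time` and `assign_day` are hypothetical and do not exist in partridge.

```python
import datetime

SECONDS_PER_DAY = 24 * 60 * 60


def parse_gtfs_time(value):
    """Convert an 'HH:MM:SS' GTFS time (hours may be 24 or more) to seconds."""
    hours, minutes, seconds = (int(part) for part in value.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def assign_day(service_date, first_departure, cutoff="03:00:00"):
    """Return the custom day a trip belongs to, given a single cutoff.

    Trips starting before the cutoff move to the previous day; trips starting
    24 hours or more after the cutoff move forward accordingly.
    """
    elapsed = parse_gtfs_time(first_departure) - parse_gtfs_time(cutoff)
    # Floor division also handles negative values (before the cutoff).
    day_offset = elapsed // SECONDS_PER_DAY
    return service_date + datetime.timedelta(days=day_offset)


day = datetime.date(2017, 9, 25)
for start in ("00:30:00", "03:00:00", "25:00:00", "27:30:00"):
    print(start, assign_day(day, start))
```

With one cutoff there is no separate "day_end" to keep consistent with "day_start": each day implicitly ends where the next one begins.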

Update: I haven't had much time to tinker with this yet. Anyone interested in implementing and sending a PR?

Could you please explain what you meant by the adjacency matrix of dates?

I made a diagram of how I understand this feature request. Imagine a simple feed with service on only three dates. Trips starting during green times would be assigned to June 9th, yellow trips are assigned to June 10th, and red to June 11th.

Does this diagram and explanation match your idea of this feature?

[Image: img_6834, diagram of trip start times color-coded by assigned date]

Oops. I closed this accidentally.

Yes this diagram matches my idea :)
I implemented this and will send you a PR soon.

Hi @adiwaz. Just checking in on your progress.

Hi!
I implemented the feature in my fork, on the "trip_id_by_date" branch.
The main logic is in the "_trip_ids_by_day" function.
It is only partly tested, which is why I didn't send a PR. Unfortunately, I don't think I will have time to write tests for it soon...

I'm going to close this issue. Happy to reconsider at some point, but at this time I consider this out of the scope of the project. Thank you for the proposal!