google/transit

Using StopTimeEvent.uncertainty for non-timepoints


Introduce yourself

Hi, I'm a developer at AC Transit trying to improve our GTFS-RT feeds.

Ask a question

My agency treats only timepoints as having accurate scheduled stop times; scheduled times at non-timepoint stops are approximate. This confuses riders, who sometimes see buses reported as early when they are actually within the schedule's tolerance. A bus might arrive and leave 3+ minutes early and still be within scheduling's accuracy constraints, so people miss it. (Running late matters less for catching the bus, but it is equally relevant at non-timepoint stops when riders transfer between legs.)

Is it good practice to populate the uncertainty field when we know the typical spread of the approximation, leaving delay at 0, so that the prediction effectively reads as the scheduled time +/- that many seconds?
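To make the question concrete, here is a minimal sketch of the payload being asked about, using plain Python dicts shaped like the GTFS-realtime StopTimeUpdate and StopTimeEvent messages. The helper name `stop_time_update` and the 180-second spread are illustrative assumptions, not anything taken from a real feed, and whether this is an appropriate use of uncertainty is exactly the open question.

```python
def stop_time_update(stop_id, stop_sequence, is_timepoint, spread_seconds):
    """Build a StopTimeUpdate-shaped dict for a vehicle running on schedule.

    Hypothetical helper: delay stays at 0, and the known schedule spread is
    published as `uncertainty` only for non-timepoint stops.
    """
    arrival = {"delay": 0}  # seconds relative to the scheduled time
    if not is_timepoint:
        arrival["uncertainty"] = spread_seconds  # assumed typical spread
    return {
        "stop_id": stop_id,
        "stop_sequence": stop_sequence,
        "arrival": arrival,
    }


updates = [
    stop_time_update("A", 1, is_timepoint=True, spread_seconds=180),
    stop_time_update("B", 2, is_timepoint=False, spread_seconds=180),
]

for update in updates:
    print(update)
```

A consumer that honors uncertainty could then show the non-timepoint arrival as a window rather than a single time, while timepoint arrivals stay exact.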

The closest I found discussing this was:

Clarify definition of "frequency-based" trips and Add prediction certainty #111, but my question isn't clearly answered by the current definition of the StopTimeEvent message.

I think the uncertainty is more about the accuracy of the prediction, which sounds like it is high in your case.

It appears that the scheduled times on non-timepoint stops have low certainty (or none at all). I would rather try to improve how to communicate this to passengers, perhaps by not publishing a time or making it otherwise more obvious that times on these stops can vary a bit.

Is it common for transit agencies to have non-timepoint stops?

For us, it is not possible to designate every stop as a timepoint, since not all stops are suitable places for the operator to wait and get back on schedule. Timepoint locations are selected by Planners and Schedulers to be safe locations where the bus can wait out of traffic. They are also selected for their proximity to major travel generators, major streets, and transfer locations. It is not industry best practice to designate every stop as a timepoint, because that also makes the service more difficult for road supervisors to manage.

Perhaps an uncertainty field, in seconds, inside GTFS-Static stop_times.txt would be a better solution. That data would benefit all data consumers.

Arrival times at timepoints can be just as uncertain as at non-timepoints, particularly if the bus is running late. If it is running hot, the bus can be held to match the scheduled departure time. So it depends on how the data in the GTFS static files is to be used. Since the GTFS static files are used for multiple purposes, not just to provide predicted times to riders, I would much rather the scheduled times for all stops be included in the GTFS static files and the timepoint boolean filled out to show which stops are timepoints. For non-timepoints, the scheduled time can be produced as simply as interpolating based on distance traveled, or on distance traveled and average traffic speed on the road segments traversed. Accuracy can be improved with a more sophisticated method that uses the AVL data and trip traces to interpolate from historical patterns of the time between stops by time of day, direction, or even trip.
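The simplest variant above, interpolating by distance alone, could be sketched as follows. This is only an illustration under stated assumptions: the function name `interpolate_times` and the sample trip are made up, `dist` stands in for something like shape_dist_traveled, and times are seconds after midnight with `None` marking a non-timepoint stop.

```python
def interpolate_times(stops):
    """Fill in missing stop times by linear interpolation on distance.

    `stops` is an ordered list of dicts with 'dist' (distance along the
    trip) and 'time' (seconds after midnight, or None if unknown).
    The first and last stops must have a time.
    """
    times = [s["time"] for s in stops]
    anchors = [i for i, t in enumerate(times) if t is not None]
    if not anchors or anchors[0] != 0 or anchors[-1] != len(stops) - 1:
        raise ValueError("first and last stop must have a scheduled time")

    # Walk each pair of consecutive timepoints and fill the stops between.
    for a, b in zip(anchors, anchors[1:]):
        span_dist = stops[b]["dist"] - stops[a]["dist"]
        span_time = times[b] - times[a]
        for i in range(a + 1, b):
            if span_dist <= 0:
                raise ValueError("distance must increase between timepoints")
            frac = (stops[i]["dist"] - stops[a]["dist"]) / span_dist
            times[i] = round(times[a] + frac * span_time)
    return times


stops = [
    {"stop_id": "A", "dist": 0.0, "time": 28800},     # 08:00:00, timepoint
    {"stop_id": "B", "dist": 500.0, "time": None},    # non-timepoint
    {"stop_id": "C", "dist": 1500.0, "time": None},   # non-timepoint
    {"stop_id": "D", "dist": 2000.0, "time": 29400},  # 08:10:00, timepoint
]
print(interpolate_times(stops))  # [28800, 28950, 29250, 29400]
```

Weighting each segment by average traffic speed, or by historical running times, would replace the `frac` calculation while keeping the same overall structure.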

Jim Bunch

You are right that uncertainty affects timepoints and non-timepoints alike. Early vehicles can pause at timepoints to get back on schedule, while late vehicles have no means of catching up without speeding or taking off-route shortcuts. We do use the more sophisticated method you mentioned for setting non-timepoint stop times, but it is often limited by the number of patterns and by the manual creation of turn-by-turn directions (due to base map changes between the scheduling and AVL systems).

Having thought about this more, you are right that historical data can automatically generate the accuracy spread for both timepoints and non-timepoints, although this takes an algorithm instead of a lookup table. One of the nice things about GTFS-Static is that these kinds of data points are pre-calculated, so data consumers do not have to compute them as well. In fact, the arrival/departure times could likewise all be calculated from historical vehicle position analysis without putting them into stop_times.txt, but that is computationally inefficient.

My point is that an accuracy range inside stop_times.txt can help reduce rider confusion when the schedule is running as designed but the real-time information has no way to explain the earliness or lateness built into the system. That gap makes transfers less deterministic and therefore frustrating.
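As a rough illustration of deriving that spread from historical data, the sketch below groups observed schedule deviations by stop and reports a percentile band for each. Everything here is an assumption for illustration: the function name `deviation_spread`, the `(stop_id, scheduled, actual)` input shape, the 10th/90th percentile choice, and the sample numbers.

```python
from collections import defaultdict
from statistics import quantiles


def deviation_spread(observations, low_pct=10, high_pct=90):
    """Return {stop_id: (low, high)} schedule deviation in seconds.

    `observations` is an iterable of (stop_id, scheduled_s, actual_s).
    Negative deviations mean the vehicle was early. Stops with fewer
    than two observations are skipped.
    """
    by_stop = defaultdict(list)
    for stop_id, scheduled, actual in observations:
        by_stop[stop_id].append(actual - scheduled)

    spread = {}
    for stop_id, deviations in by_stop.items():
        if len(deviations) < 2:
            continue
        # 99 cut points; index p - 1 is the p-th percentile.
        cuts = quantiles(deviations, n=100, method="inclusive")
        spread[stop_id] = (cuts[low_pct - 1], cuts[high_pct - 1])
    return spread


# Made-up history for one non-timepoint stop scheduled at 08:05:00.
history = [
    ("B", 29100, 28920),  # 180 s early
    ("B", 29100, 29040),  # 60 s early
    ("B", 29100, 29100),  # on time
    ("B", 29100, 29160),  # 60 s late
    ("B", 29100, 29220),  # 120 s late
]
for stop_id, (low, high) in deviation_spread(history).items():
    print(stop_id, low, high)
```

The resulting band is the kind of value that could be pre-calculated once by the producer and published, rather than recomputed by every consumer.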