More than one GTFS-rt TripUpdate per vehicle is not supported
For the develop-freq branch:
We've been working on adding a new agency (PSTA) to OBA Tampa, and have been working with their GTFS-rt feed under development, for schedule-based trips. One big difference between the feed that they are providing (http://97.76.252.61:8080/gtfsrt/trips?debug) vs. other feeds we've seen is that they are providing predictions both within the trip the vehicle is currently traveling as well as future trips in the same block.
When we provide this feed with more than one trip per vehicle to OBA, we see the following output:
2015-03-23 18:34:15,940 INFO [GtfsRealtimeSource.java:232] : refreshing http://97.76.252.61:8080/gtfsrt/trips
2015-03-23 18:34:16,463 WARN [GtfsRealtimeTripLibrary.java:139] : Multiple TripUpdates for vehicle 2903; taking newest.
2015-03-23 18:34:16,463 WARN [GtfsRealtimeTripLibrary.java:139] : Multiple TripUpdates for vehicle 821; taking newest.
2015-03-23 18:34:16,463 WARN [GtfsRealtimeTripLibrary.java:139] : Multiple TripUpdates for vehicle 2531; taking newest.
2015-03-23 18:34:16,464 WARN [GtfsRealtimeTripLibrary.java:139] : Multiple TripUpdates for vehicle 2718; taking newest.
2015-03-23 18:34:16,464 WARN [GtfsRealtimeTripLibrary.java:139] : Multiple TripUpdates for vehicle 2715; taking newest.
2015-03-23 18:34:16,464 WARN [GtfsRealtimeTripLibrary.java:139] : Multiple TripUpdates for vehicle 2308; taking newest.
2015-03-23 18:34:16,464 WARN [GtfsRealtimeTripLibrary.java:139] : Multiple TripUpdates for vehicle 2717; taking newest.
2015-03-23 18:34:16,464 WARN [GtfsRealtimeTripLibrary.java:139] : Multiple TripUpdates for vehicle 2711; taking newest.
2015-03-23 18:34:16,464 WARN [GtfsRealtimeTripLibrary.java:139] : Multiple TripUpdates for vehicle 2611; taking newest.
2015-03-23 18:34:16,464 WARN [GtfsRealtimeTripLibrary.java:139] : Multiple TripUpdates for vehicle 2114; taking newest.
2015-03-23 18:34:16,464 WARN [GtfsRealtimeTripLibrary.java:139] : Multiple TripUpdates for vehicle 2716; taking newest.
From our reading of the code, it appears that on each poll of the GTFS-rt TripUpdates feed, OBA discards the older (timestamp-wise) of the two TripUpdates when the same vehicle is running multiple trips in the block. Multiple trip_ids for the same vehicle in the same GTFS-rt TripUpdates feed are legal in GTFS-rt, so we should consume and process these updates.
@sheldonabrown I went ahead and pushed the develop-freq branch from the camsys repo as a new branch in this repo, since, as we discussed, this is probably the best branch to start from (the code for combining vehicle/trip GTFS-rt updates is significantly different in this branch than in master; the change was introduced by @kurtraschke in 0b6bc53 to support frequency-based GTFS-rt updates as well):
https://github.com/OneBusAway/onebusaway-application-modules/tree/develop-freq
We see the same output showing the collision of multiple TripUpdates per vehicle in both the master and develop-freq branches. The above PSTA GTFS-rt feed is not frequency-based, but we're also introducing another agency (USF Bull Runner) into OBA Tampa that is frequency-based (type 0), which is another reason why we want to start there.
For schedule-based trips - to start further discussion on this, @cagryInside will submit a PR with one possible solution he tested that is based on the develop-freq branch. This solution concatenates the trip_id and vehicle_id as the key in the HashMap (instead of just vehicle_id), so that more than one TripUpdate per vehicle survives the groupTripUpdatesAndVehiclePositions() method. We no longer get the collision output in the console for PSTA's schedule-based TripUpdates after applying this fix. We'd welcome feedback on this (particularly from @kurtraschke, @sheldonabrown, or @bdferris) and possible alternate/improved ways to handle this.
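To make the difference between the two keys concrete, here is a minimal sketch. It does not use the actual OBA or GTFS-rt protobuf classes: SimpleTripUpdate, TripUpdateGrouping, and both method names are hypothetical, and the "_" separator is an assumption (the PR may join the IDs differently).

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a GTFS-rt TripUpdate; not the OBA or protobuf class.
final class SimpleTripUpdate {
  final String tripId;
  final String vehicleId;
  final long timestamp;

  SimpleTripUpdate(String tripId, String vehicleId, long timestamp) {
    this.tripId = tripId;
    this.vehicleId = vehicleId;
    this.timestamp = timestamp;
  }
}

final class TripUpdateGrouping {

  // Vehicle-only key: a second update for the same vehicle collides,
  // and only the one with the newest timestamp is kept.
  static Map<String, SimpleTripUpdate> groupByVehicle(List<SimpleTripUpdate> updates) {
    Map<String, SimpleTripUpdate> byVehicle = new LinkedHashMap<>();
    for (SimpleTripUpdate update : updates) {
      putIfNewer(byVehicle, update.vehicleId, update);
    }
    return byVehicle;
  }

  // Composite key: updates for different trips on the same vehicle no longer collide.
  // The "_" separator is an assumption made for this sketch.
  static Map<String, SimpleTripUpdate> groupByTripAndVehicle(List<SimpleTripUpdate> updates) {
    Map<String, SimpleTripUpdate> byKey = new LinkedHashMap<>();
    for (SimpleTripUpdate update : updates) {
      putIfNewer(byKey, update.tripId + "_" + update.vehicleId, update);
    }
    return byKey;
  }

  private static void putIfNewer(Map<String, SimpleTripUpdate> map, String key,
      SimpleTripUpdate update) {
    SimpleTripUpdate existing = map.get(key);
    if (existing == null || update.timestamp > existing.timestamp) {
      map.put(key, update);
    }
  }

  public static void main(String[] args) {
    List<SimpleTripUpdate> updates = Arrays.asList(
        new SimpleTripUpdate("tripA", "2903", 100L),
        new SimpleTripUpdate("tripB", "2903", 200L));
    System.out.println(groupByVehicle(updates).size());        // one entry survives
    System.out.println(groupByTripAndVehicle(updates).size()); // both entries survive
  }
}
```

With the vehicle-only key the update for tripA is dropped; with the composite key both survive the grouping step.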
For frequency-based trips - based on my understanding of GTFS-rt frequency-based (type 0) updates, if we go with the combined HashMap key implementation, we may want to combine trip_id + vehicle_id + start_time, if these fields are all provided. See an extended discussion here on why start_time is needed. We have not tested this solution for either frequency-based or schedule-based trips.
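One way the three-part key could be built is sketched below. This is untested against OBA: TripUpdateKey and build() are hypothetical names, and labelling each part ("trip=", "vehicle=", "start=") is an assumption made here so that a missing field cannot make two different combinations produce the same string.

```java
// Hypothetical helper; not part of OBA or the GTFS-rt bindings.
final class TripUpdateKey {

  // Builds a key from whichever of the three fields are present.
  // Null or empty fields are skipped rather than concatenated as "null".
  static String build(String tripId, String vehicleId, String startTime) {
    StringBuilder key = new StringBuilder();
    appendPart(key, "trip", tripId);
    appendPart(key, "vehicle", vehicleId);
    appendPart(key, "start", startTime);
    return key.toString();
  }

  private static void appendPart(StringBuilder key, String label, String value) {
    if (value == null || value.isEmpty()) {
      return;
    }
    if (key.length() > 0) {
      key.append('|');
    }
    key.append(label).append('=').append(value);
  }

  public static void main(String[] args) {
    // Two runs of the same frequency-based trip on the same vehicle,
    // distinguished only by start_time.
    System.out.println(build("loopA", "12", "08:00:00"));
    System.out.println(build("loopA", "12", "08:15:00"));
  }
}
```

For schedule-based feeds that omit start_time, the key falls back to the trip and vehicle parts only.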
Note that this is related to, but not the same as, #127. This issue targets supporting more than one TripUpdate per vehicle (no matter how many stop_time_updates are included), while #127 targets exposing more than one stop_time_update per trip via the onebusaway-api-webapp.
If I'm following the logic of the patch correctly, this results in two VehicleLocationRecord instances being pushed to the VehicleLocationListener with the same vehicle ID (and block ID); the implementation in VehicleStatusServiceImpl appears to index the submitted VLRs by vehicle ID, so I think this leads to a collision there. A few steps down the call stack, it looks like the same thing happens with VehicleLocationRecordCacheImpl.
At the risk of making things more complex, I think the ideal solution (given the way OBA's data model is oriented to blocks and vehicles), for the case where we have one VehiclePosition and one or more TripUpdates that all refer to the same vehicle and block, is to aggregate them into one VehicleLocationRecord. To preserve the individual StopTimeUpdate entries, this would probably require adding trip ID as a field to TimepointPredictionRecord (perhaps also TimepointPredictionBean). There'd also have to be some additional logic on the output side, in the GTFS-realtime feeds in onebusaway-api-webapp and onebusaway-transit-data-service-webapp, to filter out predictions for stops not on the current trip. The implementation in BlockLocationServiceImpl.getBlockLocation() actually indexes predictions by the scheduled arrival time, so the trip ID is irrelevant there.
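The aggregation idea could look roughly like the sketch below. None of these types are OBA classes: FlatStopTimeUpdate, TripAwarePrediction, and VehicleAggregator are hypothetical stand-ins, with TripAwarePrediction playing the role of a TimepointPredictionRecord that has gained a trip ID field.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical flattened input: one stop_time_update plus the vehicle and trip it came from.
final class FlatStopTimeUpdate {
  final String vehicleId;
  final String tripId;
  final String stopId;
  final long predictedTimeMillis;

  FlatStopTimeUpdate(String vehicleId, String tripId, String stopId, long predictedTimeMillis) {
    this.vehicleId = vehicleId;
    this.tripId = tripId;
    this.stopId = stopId;
    this.predictedTimeMillis = predictedTimeMillis;
  }
}

// Hypothetical prediction that carries its trip ID (the field proposed above).
final class TripAwarePrediction {
  final String tripId;
  final String stopId;
  final long predictedTimeMillis;

  TripAwarePrediction(String tripId, String stopId, long predictedTimeMillis) {
    this.tripId = tripId;
    this.stopId = stopId;
    this.predictedTimeMillis = predictedTimeMillis;
  }
}

final class VehicleAggregator {

  // Collects predictions from all trips of a vehicle into one list per vehicle,
  // so only one record per vehicle ID is handed downstream.
  static Map<String, List<TripAwarePrediction>> aggregateByVehicle(
      List<FlatStopTimeUpdate> updates) {
    Map<String, List<TripAwarePrediction>> byVehicle = new LinkedHashMap<>();
    for (FlatStopTimeUpdate update : updates) {
      List<TripAwarePrediction> predictions = byVehicle.get(update.vehicleId);
      if (predictions == null) {
        predictions = new ArrayList<>();
        byVehicle.put(update.vehicleId, predictions);
      }
      predictions.add(new TripAwarePrediction(
          update.tripId, update.stopId, update.predictedTimeMillis));
    }
    return byVehicle;
  }

  // Output-side filter: keep only the predictions that belong to the given trip.
  static List<TripAwarePrediction> forTrip(List<TripAwarePrediction> predictions, String tripId) {
    List<TripAwarePrediction> result = new ArrayList<>();
    for (TripAwarePrediction prediction : predictions) {
      if (prediction.tripId.equals(tripId)) {
        result.add(prediction);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    List<FlatStopTimeUpdate> updates = Arrays.asList(
        new FlatStopTimeUpdate("2903", "tripA", "stop1", 1000L),
        new FlatStopTimeUpdate("2903", "tripB", "stop1", 5000L));
    List<TripAwarePrediction> all = aggregateByVehicle(updates).get("2903");
    System.out.println(all.size());                   // predictions from both trips
    System.out.println(forTrip(all, "tripA").size()); // only the current trip's predictions
  }
}
```

Because each prediction keeps its trip ID, the same stop appearing on two trips of the block stays distinguishable on the output side.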
The recently-released TripUpdate feed for Big Blue Bus also triggers this issue - for each vehicle, it includes the vehicle's current trip as well as a prediction for the first stop of the next trip, where there is a next trip.
I think I've managed to convince myself that this issue is unique to the develop-freq branch - I believe develop doesn't have this problem.
In the develop branch's GtfsRealtimeTripLibrary.applyTripUpdatesToRecord(), we save all the GTFS-rt stop_time_updates as TimepointPredictionRecords:
for (StopTimeUpdate stopTimeUpdate : tripUpdate.getStopTimeUpdateList()) {
  BlockStopTimeEntry blockStopTime = getBlockStopTimeForStopTimeUpdate(
      tripUpdate, stopTimeUpdate, blockTrip.getStopTimes(),
      instance.getServiceDate());
  if (blockStopTime == null)
    continue;
  StopTimeEntry stopTime = blockStopTime.getStopTime();
  int currentArrivalTime = computeArrivalTime(stopTime,
      stopTimeUpdate, instance.getServiceDate());
  if (currentArrivalTime >= 0) {
    updateBestScheduleDeviation(currentTime,
        stopTime.getArrivalTime(), currentArrivalTime, best);
    long timepointPredictedTime = instance.getServiceDate() + (currentArrivalTime * 1000L);
    TimepointPredictionRecord tpr = new TimepointPredictionRecord();
    tpr.setTimepointId(stopTime.getStop().getId());
    tpr.setTimepointPredictedTime(timepointPredictedTime);
    // Save the GTFS-rt stop_time_updates as TimepointPredictionRecords
    timepointPredictions.add(tpr);
  }
  int currentDepartureTime = computeDepartureTime(stopTime,
      stopTimeUpdate, instance.getServiceDate());
  if (currentDepartureTime >= 0) {
    updateBestScheduleDeviation(currentTime,
        stopTime.getDepartureTime(), currentDepartureTime, best);
  }
}
Then, in BlockLocationServiceImpl.getBlockLocation(), we loop through all the TimepointPredictionRecords and put the schedule deviation for each prediction into a SortedMap, with the scheduled arrival time as the sort key:
if (timepointPredictions != null && !timepointPredictions.isEmpty()) {
  // The SortedMap sorts deviations by scheduled arrival times, based on the GTFS-rt stop_time_updates
  SortedMap<Integer, Double> scheduleDeviations = new TreeMap<Integer, Double>();
  BlockConfigurationEntry blockConfig = blockInstance.getBlock();
  for (TimepointPredictionRecord tpr : timepointPredictions) {
    AgencyAndId stopId = tpr.getTimepointId();
    long predictedTime = tpr.getTimepointPredictedTime();
    if (stopId == null || predictedTime == 0)
      continue;
    for (BlockStopTimeEntry blockStopTime : blockConfig.getStopTimes()) {
      StopTimeEntry stopTime = blockStopTime.getStopTime();
      StopEntry stop = stopTime.getStop();
      if (stopId.equals(stop.getId())) {
        int arrivalTime = stopTime.getArrivalTime();
        int deviation = (int) ((tpr.getTimepointPredictedTime() - blockInstance.getServiceDate()) / 1000 - arrivalTime);
        // Store the deviation in the SortedMap, with scheduled arrival time as key
        scheduleDeviations.put(arrivalTime, (double) deviation);
      }
    }
  }
Because we're using the SortedMap, the predictions should end up being sorted by arrival time for each block. So, all TripUpdates for a vehicle should appear in the SortedMap (as deviations) in the correct order.
I believe this should also handle the case where TripUpdates occur in the GTFS-rt feed out-of-order.
For example, if you have Trip A and B in that order in a block, the GTFS-rt feed may have them in reverse sequential order:
{Trip B}
{Trip A}
We process them out-of-order as TimepointPredictionRecords, but the SortedMap sorts the predictions by scheduled arrival time within the block.
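A small standalone sketch of that ordering behavior, using only java.util.TreeMap. DeviationOrdering and collect() are hypothetical names, and the arrival times and deviations are made-up values in seconds.

```java
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical demo class; only TreeMap/SortedMap come from the JDK.
final class DeviationOrdering {

  // Inserts (scheduled arrival time -> deviation) pairs in the order given.
  // TreeMap keeps its keys sorted, so iteration order is by scheduled arrival
  // time regardless of insertion order.
  static SortedMap<Integer, Double> collect(int[] scheduledArrivals, double[] deviations) {
    SortedMap<Integer, Double> scheduleDeviations = new TreeMap<Integer, Double>();
    for (int i = 0; i < scheduledArrivals.length; i++) {
      scheduleDeviations.put(scheduledArrivals[i], deviations[i]);
    }
    return scheduleDeviations;
  }

  public static void main(String[] args) {
    // Trip B's stops (36000, 36300) arrive in the feed before Trip A's (32400, 32700).
    int[] arrivals = {36000, 36300, 32400, 32700};
    double[] deviations = {120.0, 90.0, 60.0, 30.0};
    System.out.println(collect(arrivals, deviations).keySet());
  }
}
```

One thing to keep in mind with this structure: two stop times with the same scheduled arrival time share a key, so the later put() replaces the earlier deviation.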
@kurtraschke Does all this seem right to you?
@barbeau I concur with your analysis; also, it appears that the concerns I raised in my previous comment are not applicable to develop either. So, it looks like this is a general deficiency I introduced in the reworked GtfsRealtimeTripLibrary for frequency-based trip support.
@kurtraschke Ok, thanks for looking at this again. Our highest priority right now in Tampa is schedule-based systems, so we're aiming to deploy the fixes related to the per-stop predictions (#127, #138, #139) first. After that, we hope to revisit your work in the frequency-based develop-freq branch and get that integrated as well.