CUTR-at-USF/bullrunner-gtfs-realtime-generator

Multiple trip updates appearing for same trip instance (loop instance)

barbeau opened this issue · 5 comments

@cagryInside found the following when providing the GTFS-rt Trip Updates feed (http://mobullity.forest.usf.edu:8088/trip-updates?debug) to OneBusAway:

Log: INFO  [GtfsRealtimeSource.java:232] : refreshing http://mobullity.forest.usf.edu:8088/trip-updates
WARN  [GtfsRealtimeTripLibrary.java:139] : Multiple TripUpdates for vehicle 2233; taking newest.
...
Gtfs feed:

entity {

  id: "17"

trip_update {

    trip {

      trip_id: "3"

      start_time: "11:16:58"

      schedule_relationship: UNSCHEDULED

      route_id: "B"

    }

   ...

    vehicle {

      id: "2233"

    }


entity {

  id: "18"

  trip_update {

    trip {

      trip_id: "3"

      start_time: "11:16:58"

      schedule_relationship: UNSCHEDULED

      route_id: "B"

    }

   ...

    vehicle {

      id: "2233"

    }

The combination of trip_id + start_time + vehicle_id should be unique (i.e., this is one "loop" of the frequency-based route B), so having two records with the exact same values is wrong. If the second trip_update refers to the next instance of the loop, then the start_time should be different (and should be equal to the time that the first prediction for this trip instance became visible from the Syncromatics API). See the GTFS-rt TripDescriptor semantics document for more detail.
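For illustration, here is a minimal Python sketch of that uniqueness check. It assumes each trip update has already been reduced to a plain dict with `trip_id`, `start_time` and `vehicle_id` keys; the function name `duplicate_trip_instances` is hypothetical and is not part of the generator or of OneBusAway.

```python
from collections import Counter


def duplicate_trip_instances(entities):
    """Return the (trip_id, start_time, vehicle_id) keys that occur more than once."""
    counts = Counter(
        (e["trip_id"], e["start_time"], e["vehicle_id"]) for e in entities
    )
    return sorted(key for key, count in counts.items() if count > 1)


# Entities 17 and 18 from the feed excerpt above, reduced to the key fields.
feed = [
    {"id": "17", "trip_id": "3", "start_time": "11:16:58", "vehicle_id": "2233"},
    {"id": "18", "trip_id": "3", "start_time": "11:16:58", "vehicle_id": "2233"},
]
print(duplicate_trip_instances(feed))  # [('3', '11:16:58', '2233')]
```

An empty result would mean every loop instance in the feed snapshot has its own key.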

@jmfield2 Could you please take a look at this?

The data below also seems suspicious. Technically it's valid, because the start_times of the two trip updates for the same trip_id and vehicle are different, but they are only 30 seconds apart. We would normally expect the time difference to be approximately the amount of time it takes the vehicle to run one instance of the loop route:

entity {
  id: "9"
  trip_update {
    trip {
      trip_id: "14"
      start_time: "15:28:30"
      schedule_relationship: UNSCHEDULED
      route_id: "F"
    }
    stop_time_update {
      stop_sequence: 23
      arrival {
        time: 1427484000
      }
      stop_id: "517"
    }
    stop_time_update {
      stop_sequence: 24
      arrival {
        time: 1427484000
      }
      stop_id: "521"
    }
    stop_time_update {
      stop_sequence: 25
      arrival {
        time: 1427484060
      }
      stop_id: "527"
    }
    stop_time_update {
      stop_sequence: 26
      arrival {
        time: 1427484180
      }
      stop_id: "912"
    }
    stop_time_update {
      stop_sequence: 27
      arrival {
        time: 1427484240
      }
      stop_id: "906"
    }
    stop_time_update {
      stop_sequence: 28
      arrival {
        time: 1427484240
      }
      stop_id: "904"
    }
    stop_time_update {
      stop_sequence: 29
      arrival {
        time: 1427484300
      }
      stop_id: "446"
    }
    stop_time_update {
      stop_sequence: 30
      arrival {
        time: 1427484360
      }
      stop_id: "426"
    }
    stop_time_update {
      stop_sequence: 31
      arrival {
        time: 1427484420
      }
      stop_id: "418"
    }
    vehicle {
      id: "1329"
    }
  }
}
entity {
  id: "8"
  trip_update {
    trip {
      trip_id: "14"
      start_time: "15:28:00"
      schedule_relationship: UNSCHEDULED
      route_id: "F"
    }
    stop_time_update {
      stop_sequence: 1
      arrival {
        time: 1427484480
      }
      stop_id: "401"
    }
    stop_time_update {
      stop_sequence: 2
      arrival {
        time: 1427485080
      }
      stop_id: "421"
    }
    stop_time_update {
      stop_sequence: 3
      arrival {
        time: 1427485140
      }
      stop_id: "425"
    }
    stop_time_update {
      stop_sequence: 4
      arrival {
        time: 1427485200
      }
      stop_id: "445"
    }
    stop_time_update {
      stop_sequence: 5
      arrival {
        time: 1427485260
      }
      stop_id: "449"
    }
    stop_time_update {
      stop_sequence: 6
      arrival {
        time: 1427485320
      }
      stop_id: "905"
    }
    stop_time_update {
      stop_sequence: 7
      arrival {
        time: 1427485380
      }
      stop_id: "911"
    }
    stop_time_update {
      stop_sequence: 8
      arrival {
        time: 1427485560
      }
      stop_id: "526"
    }
    stop_time_update {
      stop_sequence: 9
      arrival {
        time: 1427485620
      }
      stop_id: "520"
    }
    stop_time_update {
      stop_sequence: 10
      arrival {
        time: 1427485740
      }
      stop_id: "518"
    }
    stop_time_update {
      stop_sequence: 11
      arrival {
        time: 1427485800
      }
      stop_id: "514"
    }
    stop_time_update {
      stop_sequence: 12
      arrival {
        time: 1427485860
      }
      stop_id: "510"
    }
    stop_time_update {
      stop_sequence: 13
      arrival {
        time: 1427485920
      }
      stop_id: "508"
    }
    stop_time_update {
      stop_sequence: 14
      arrival {
        time: 1427486040
      }
      stop_id: "504"
    }
    stop_time_update {
      stop_sequence: 15
      arrival {
        time: 1427486040
      }
      stop_id: "502"
    }
    vehicle {
      id: "1329"
    }
  }

So, I looked into this issue over the weekend, and it seems to originate from the way the generator updates the start_times for each (route, vehicle) pair when a new sequence is received. That is, it overwrites the time for the previous instance with the current time for every new prediction received, so in some cases the previous time could end up equal to the current time if the prediction didn't change ... I think.

My proposed solution (jmfield2@9496679), which I'm still testing locally and will try to test on mobullity shortly if you think it could work, is as follows:

When a new prediction is received for stop #1, update the current instance time to reflect the new data.
IFF the current instance time is 'old' enough, that is, 60*10 seconds (10 minutes) older than this new prediction time, then copy the current time to the previous time.
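A rough Python sketch of the two steps above, under some assumptions of mine: times are epoch seconds, the copy to the previous time happens before the current time is overwritten, and "old enough" means a gap of at least 600 seconds. The function name and the tuple-based state are hypothetical and are not taken from the commit.

```python
NEW_INSTANCE_GAP_SECONDS = 60 * 10  # the 10 minute threshold mentioned above


def apply_first_stop_prediction(current, previous, prediction_time,
                                gap=NEW_INSTANCE_GAP_SECONDS):
    """Return the updated (current, previous) instance times for one
    (route, vehicle) pair after a new prediction for stop #1."""
    if current is not None and prediction_time - current >= gap:
        # Assumed ordering: keep the old value before it is overwritten.
        previous = current
    # The current instance time always follows the newest prediction.
    current = prediction_time
    return current, previous


state = (None, None)
for t in (1000, 1030, 1060, 4000):
    state = apply_first_stop_prediction(state[0], state[1], t)
print(state)  # (4000, 1060)
```

In this sketch small refinements (30 seconds here) only move the current time, while the jump to 4000 is large enough to push 1060 into the previous slot.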

So far, it seems to be working as expected.

Any thoughts?

When a new prediction is received for stop #1, update the current instance time to reflect the new data. IFF the current instance time is 'old' enough, that is, 60*10 seconds (10 minutes) older than this new prediction time, then copy the current time to the previous time.

@jmfield2 when you say "new prediction", do you mean that the predicted arrival time for stop_sequence=1 changes?

A changing predicted arrival time for stop_sequence=1 in a stop_time_update alone doesn't necessarily indicate the beginning of a new trip instance, since that prediction could change multiple times as the predictions are refined while the vehicle approaches the stop (but, as mentioned before, the start_time of the trip instance should never change after it is set). My understanding of the current implementation is that it should be looking for non-increasing arrival time values in the stop_sequences, and will split the trip instances on that non-increasing time value (although admittedly I haven't dug into the code myself). See the opentripplanner/OpenTripPlanner#1347 (comment) discussion for a presentation of arrival times from the feed and how this tends to look for two different trip loop instances. Note that data errors in predictions could also potentially introduce more than one non-increasing value.
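To make the splitting idea concrete, here is a small Python sketch (not the generator's actual code, which I haven't checked). It assumes the predicted arrival times for one vehicle are listed in stop_sequence order, and it splits only when a value drops below the one before it; equal neighbours stay in the same instance, because the feed excerpts above show repeated equal arrival times inside a single trip update. The function name is hypothetical.

```python
def split_on_decrease(arrival_times):
    """Split arrival times (in stop_sequence order) into trip instances,
    starting a new instance whenever a value is lower than its predecessor."""
    instances = []
    for t in arrival_times:
        if instances and t >= instances[-1][-1]:
            instances[-1].append(t)
        else:
            instances.append([t])
    return instances


# A few arrival times for vehicle 1329 from the excerpts above:
# stop_sequence 13, 14, 15, then 23, 24, 25.
times = [1427485920, 1427486040, 1427486040,
         1427484000, 1427484000, 1427484060]
print(len(split_on_decrease(times)))  # 2
```

A prediction error that introduces a second drop would produce a third instance here, which is the case the note about data errors is warning about.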

I think the "best" solution is probably to introduce some "reality-check" on the number of non-increasing values allowed to generate new trip instances for the same vehicle (in reality it should be 1 max, I believe, without digging deeper myself), in addition to the existing logic of splitting trip instances using the non-increasing values. This could also take the form of the hard time threshold you mention, to make sure we're not generating start_times that are way too close together. I'm not sure how long the Bull Runner normally takes to run a route, but my feeling is that 10 min is probably a reasonable threshold (and maybe even a little more) without getting too close to real circulation times.
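As a sketch of what the hard time threshold could look like, the Python below flags start_times for one vehicle and trip_id that are closer together than a minimum gap. The 10 minute value is the threshold discussed in this thread, not a measured circulation time, and the helper names are hypothetical.

```python
MIN_GAP_SECONDS = 10 * 60  # threshold discussed above, an assumption


def to_seconds(hhmmss):
    """Convert a GTFS "HH:MM:SS" time string to seconds."""
    h, m, s = (int(part) for part in hhmmss.split(":"))
    return h * 3600 + m * 60 + s


def too_close(start_times, min_gap=MIN_GAP_SECONDS):
    """Return adjacent pairs of start_times less than min_gap seconds apart."""
    ordered = sorted(start_times, key=to_seconds)
    return [
        (a, b)
        for a, b in zip(ordered, ordered[1:])
        if to_seconds(b) - to_seconds(a) < min_gap
    ]


# The two start_times reported for vehicle 1329 on route F earlier in this issue.
print(too_close(["15:28:30", "15:28:00"]))  # [('15:28:00', '15:28:30')]
```

A generator could refuse to create a new trip instance whenever such a pair would result.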

Let me know if this isn't clear (maybe I'm not understanding your exact proposed solution either), and we can try to squeeze in a Hangout tomorrow or Wed.

@jmfield2 committed and deployed to mobullity my proposed solution for the duplicate start times in the trip-updates feed.

It looks like it fixed the start time problem. For example, for the same trip and vehicle id, we now get different start times:

entity {
  id: "18"
  trip_update {
    trip {
      trip_id: "13"
      start_time: "10:02:12"
      schedule_relationship: UNSCHEDULED
      route_id: "F"
    } ...
    vehicle {
      id: "3003"
    }
  }
}
entity {
  id: "19"
  trip_update {
    trip {
      trip_id: "13"
      start_time: "09:11:42"
      schedule_relationship: UNSCHEDULED
      route_id: "F"
    } ...
    vehicle {
      id: "3003"
    }
  }
}

However, we still don't update the trips in OneBusAway (OBA, https://github.com/OneBusAway/onebusaway-application-modules/tree/develop-freq). We already proposed and implemented a new flow for trip updates in frequency-based systems (issue OneBusAway/onebusaway-application-modules#128 and PR OneBusAway/onebusaway-application-modules#129). Since both trip updates have the same trip and vehicle id, we still skip the next trip in OBA:

2015-04-09 09:59:13,591 INFO  [GtfsRealtimeSource.java:232] : refreshing http://mobullity.forest.usf.edu:8088/trip-updates
2015-04-09 09:59:13,607 WARN  [GtfsRealtimeTripLibrary.java:139] : Multiple TripUpdates for vehicle and trip 3003_13; taking newest.
2015-04-09 09:59:13,607 WARN  [GtfsRealtimeTripLibrary.java:139] : Multiple TripUpdates for vehicle and trip 1329_13; taking newest.
2015-04-09 09:59:13,607 WARN  [GtfsRealtimeTripLibrary.java:139] : Multiple TripUpdates for vehicle and trip 1123_8; taking newest.
2015-04-09 09:59:13,607 WARN  [GtfsRealtimeTripLibrary.java:139] : Multiple TripUpdates for vehicle and trip 4009_8; taking newest.
2015-04-09 09:59:13,607 WARN  [GtfsRealtimeTripLibrary.java:139] : Multiple TripUpdates for vehicle and trip 1979_11; taking newest.
2015-04-09 09:59:13,607 WARN  [GtfsRealtimeTripLibrary.java:139] : Multiple TripUpdates for vehicle and trip 2102_1; taking newest.
2015-04-09 09:59:13,607 WARN  [GtfsRealtimeTripLibrary.java:139] : Multiple TripUpdates for vehicle and trip 1980_11; taking newest.
2015-04-09 09:59:13,607 WARN  [GtfsRealtimeTripLibrary.java:139] : Multiple TripUpdates for vehicle and trip 2252_1; taking newest.

So, if this is the correct behavior for the Bull Runner and other frequency-based systems, we might want to update the OneBusAway project (issue OneBusAway/onebusaway-application-modules#128 and PR OneBusAway/onebusaway-application-modules#129). In this case we need to concatenate three parameters:
trip_id + vehicle_id + start_time
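A small Python sketch of why the third parameter matters. The `vehicle_trip` form mirrors the `3003_13` identifiers in the log above; appending `start_time` with another underscore is only my assumption about how a three-part key might be written, and the function names are hypothetical rather than OBA's.

```python
def instance_key(update, include_start_time=True):
    """Build a key for one trip update from vehicle_id, trip_id and,
    optionally, start_time."""
    parts = [update["vehicle_id"], update["trip_id"]]
    if include_start_time:
        parts.append(update["start_time"])
    return "_".join(parts)


def group_by_key(updates, include_start_time=True):
    """Group trip updates that share the same key."""
    grouped = {}
    for update in updates:
        key = instance_key(update, include_start_time)
        grouped.setdefault(key, []).append(update)
    return grouped


# Entities 18 and 19 from the excerpt above, reduced to the key fields.
updates = [
    {"vehicle_id": "3003", "trip_id": "13", "start_time": "10:02:12"},
    {"vehicle_id": "3003", "trip_id": "13", "start_time": "09:11:42"},
]
print(sorted(group_by_key(updates, include_start_time=False)))  # ['3003_13']
print(sorted(group_by_key(updates)))  # ['3003_13_09:11:42', '3003_13_10:02:12']
```

With the two-part key both updates collapse into one group, which matches the "Multiple TripUpdates" warning; with the three-part key each loop instance keeps its own entry.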

If this is not the desired behavior, we need to change the bullrunner-gtfs-realtime-generator. (I personally think that this is the correct behavior, and we need to update the OBA).

cc'd @barbeau

So, I checked this tonight and noticed an issue that I'll need to investigate further:

entity {
  id: "4"
  trip_update {
    trip {
      trip_id: "1"
      start_time: "21:27:13"
      schedule_relationship: UNSCHEDULED
      route_id: "A"
    }

  trip_update {
    trip {
      trip_id: "1"
      start_time: "21:30:43"
      schedule_relationship: UNSCHEDULED
      route_id: "A"
    }

The same vehicle on route A had two start times only 3 minutes 30 seconds apart ... I'm guessing this could be from bad Syncromatics data or gaps, but I'm not sure yet.
