stop_times.shape_dist_traveled shouldn't be defined if the trip doesn't have an associated shape
Opened this issue · 7 comments
Context
The spec definition of stop_times.shape_dist_traveled says (reference):
Actual distance traveled along the associated shape, from the first stop to the stop specified in this record. This field specifies how much of the shape to draw between any two stops during a trip. Must be in the same units used in shapes.txt.
We've observed cases where stop_times.shape_dist_traveled is specified for trips that don't have a shape associated (no shape_id for the trip_id referenced in stop_times.txt).
trips.txt
route_id | service_id | trip_id | shape_id |
---|---|---|---|
route_a | regular | trip_1 | |
route_a | regular | trip_2 | |
route_a | regular | trip_3 | |
stop_times.txt
trip_id | stop_id | stop_sequence | shape_dist_traveled |
---|---|---|---|
trip_1 | stop_3 | 3 | 0 |
trip_1 | stop_4 | 4 | 298 |
trip_1 | stop_5 | 5 | 1029 |
Although this doesn't break anything, it seems like it's never intentional, and could potentially mean that a shape was intended to be associated.
Proposed solution
We would like to amend the specification with a "should" statement and add a check in the Canonical GTFS Schedule Validator with a WARNING severity level.
The new statement could look like this:
shape_dist_traveled should not be specified if the trip_id value does not have a shape_id defined in trips.txt.
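As a rough illustration of the proposed check, here is a minimal Python sketch using the example feed above. It is not the Canonical GTFS Schedule Validator's implementation: the function name `find_dist_without_shape`, the notice code `shape_dist_traveled_without_shape`, and the notice fields are all hypothetical, and the severity is left as a parameter since the level is still under discussion.

```python
import csv
import io

# Example feed from this issue, written as CSV the way GTFS files are stored.
TRIPS = """route_id,service_id,trip_id,shape_id
route_a,regular,trip_1,
route_a,regular,trip_2,
route_a,regular,trip_3,
"""

STOP_TIMES = """trip_id,stop_id,stop_sequence,shape_dist_traveled
trip_1,stop_3,3,0
trip_1,stop_4,4,298
trip_1,stop_5,5,1029
"""


def find_dist_without_shape(trips_csv, stop_times_csv, severity="WARNING"):
    """Return one notice per stop_times row that sets shape_dist_traveled
    for a trip with no shape_id in trips.txt.

    Hypothetical helper: the notice code and fields are illustrative only.
    """
    # Collect the trips that do reference a shape (non-empty shape_id).
    trips_with_shape = {
        row["trip_id"]
        for row in csv.DictReader(io.StringIO(trips_csv))
        if (row.get("shape_id") or "").strip()
    }

    notices = []
    reader = csv.DictReader(io.StringIO(stop_times_csv))
    # Row 1 of the file is the header, so data rows start at 2.
    for row_number, row in enumerate(reader, start=2):
        dist = (row.get("shape_dist_traveled") or "").strip()
        # "0" is a defined value, so test for an empty string, not for zero.
        if dist and row["trip_id"] not in trips_with_shape:
            notices.append({
                "code": "shape_dist_traveled_without_shape",  # hypothetical
                "severity": severity,
                "csv_row_number": row_number,
                "trip_id": row["trip_id"],
                "shape_dist_traveled": dist,
            })
    return notices


if __name__ == "__main__":
    for notice in find_dist_without_shape(TRIPS, STOP_TIMES):
        print(notice)
```

With the example feed, every stop_times.txt row for trip_1 is reported, because trip_1 has an empty shape_id.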
Questions we'd like the community's input on before proposing a spec amendment
- Are we in agreement that this should be flagged as a WARNING?
- Do we also need shapes.shape_dist_traveled to be defined to make use of stop_times.shape_dist_traveled?
Nope, because shape distance can also be used to describe the distance traveled, for example for fare computation. An actual geographical shape is not required for this.
And for the second question, also nope. While it does improve linear referencing, it is not required.
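To make the shape-free use concrete, here is a minimal Python sketch that derives the distance traveled between two stops of a trip from stop_times.shape_dist_traveled alone, using the example rows from the issue. The function name `distance_between` and the tuple layout are hypothetical, and the values are assumed to be in whatever unit the feed uses; no fare rule is implied.

```python
# (trip_id, stop_sequence, shape_dist_traveled) rows from the example feed.
STOP_TIMES = [
    ("trip_1", 3, 0.0),
    ("trip_1", 4, 298.0),
    ("trip_1", 5, 1029.0),
]


def distance_between(stop_times, trip_id, board_seq, alight_seq):
    """Distance traveled between two stop_sequence values on one trip.

    Hypothetical helper: it only subtracts the cumulative distances and
    never reads shapes.txt.
    """
    dist_by_seq = {
        seq: dist for (trip, seq, dist) in stop_times if trip == trip_id
    }
    return dist_by_seq[alight_seq] - dist_by_seq[board_seq]


if __name__ == "__main__":
    # Boarding at stop_sequence 4 and alighting at stop_sequence 5.
    print(distance_between(STOP_TIMES, "trip_1", 4, 5))  # 731.0
```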
Thank you for this info, @skinkie!
If this is accurate, then we would still propose a spec amendment because currently, the spec implies there is a shape.
Curious to have input from others on this one, especially consumers & orgs that see lots of GTFS data (cc @npaun, @bdferris-v2, @flocsy, @drewda, @e-lo, @westontrillium)
@skinkie's use case for defining distance traveled without a shape is one I hadn't considered. Regardless, isn't there precedent to have validation warnings for things that do not inherently violate best practices/hard spec rules, and may in fact be purposive, but that should still be reviewed due to the likelihood of it being in error?
@westontrillium There's precedent, but it's been making the boundaries of the definitions unclear. We'd like to change this. There's an INFO severity outside of ERROR and WARNING that we've been using to flag possible issues with a feed that are not explicitly recommended or banned in the spec or best practices. Info notices here.
It sounds like this issue may be a good candidate for the INFO severity level, if there's alignment that it is not a "should" to have a shape_id for the trip_id referenced in stop_times.txt when the trip has an associated stop_times.shape_dist_traveled.
INFO sounds like the perfect fit, in my estimation.
Hello,
We would like to amend the spec to better reflect what @skinkie highlighted. Here is a proposal:
Actual distance traveled along the trip, from the first stop to the stop specified in this record.
If used alongside shapes.txt, this field specifies how much of the shape to draw between any two stops during a trip and it must be in the same units used in shapes.shape_dist_traveled.
Thoughts?