google/transit

Best practice for the use of shapes

Opened this issue · 6 comments

GTFS allows publishing shapes, but treats a shape as a fairly abstract thing. Technically it is even possible to share a shape between trips that do not have the same stop sequence but follow the same route over the road or over the rail and use only parts of the shape. For long-distance buses, shapes.txt might be enormously dense, tracing many highways and city entrances; the opposite might also be true, where the shape is sparse and intended more as input for map matching.

  1. shapes.txt is currently just a list of points; no file defines the properties of a shape_id. I think we must introduce a new file for this, so that a producer can state whether a shape is specific or abstracted, and whether it is shared between unrelated stop sequences.
  2. As shapes are designed now, there is no formal requirement that a shape be split at a stop (that is, that a shape point exist there), because stop_times.txt defines an offset (shape_dist_traveled). I think it would be a very good best practice for such a point to exist: a straight road with three stops would then have not just two shape points (at the beginning and end) but also a point onto which each stop can be projected.
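To make the second point concrete, here is a minimal sketch of how a producer might add a shape point at each stop's projection. It is not part of the GTFS spec. It assumes coordinates are already in a planar system (x, y in metres), and the names `project_on_segment` and `insert_stop_points` are hypothetical.

```python
def project_on_segment(p, a, b):
    """Return the point on segment a-b closest to p (planar coordinates)."""
    ax, ay = a
    bx, by = b
    px, py = p
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        return a  # degenerate segment: both ends coincide
    # Fraction along the segment, clamped so the result stays on the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return (ax + t * dx, ay + t * dy)


def insert_stop_points(shape, stops):
    """Return a copy of `shape` with each stop's projection added as a vertex.

    Assumption: each stop belongs to the segment it is nearest to. This is
    too naive for shapes that loop back over themselves.
    """
    result = list(shape)
    for stop in stops:
        best = None  # (squared distance, segment index, projected point)
        for i in range(len(result) - 1):
            q = project_on_segment(stop, result[i], result[i + 1])
            d = (q[0] - stop[0]) ** 2 + (q[1] - stop[1]) ** 2
            if best is None or d < best[0]:
                best = (d, i, q)
        _, i, q = best
        # Only insert when the projection is not already a vertex.
        if q != result[i] and q != result[i + 1]:
            result.insert(i + 1, q)
    return result


# A straight road with two shape points and three stops just off the road.
road = [(0, 0), (1000, 0)]
stops = [(0, 5), (500, 5), (1000, 5)]
densified = insert_stop_points(road, stops)
```

In this example only the middle stop produces a new vertex, since the first and last stops project onto the existing end points.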

Related discussion recently - https://mobilitydata-io.slack.com/archives/C3FFFKX9C/p1714674761313129

This could be related to the distance-based fares feature, which is currently in our GTFS-Fares v2 backlog.

e-lo commented

> shared between unrelated stop sequences.

I'm curious about the use case for needing to know this?

e-lo commented

> define if a shape is specific or abstracted

Can't this be inferred from the density of points?
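As a rough illustration of what such an inference could look like, the sketch below computes the mean spacing between consecutive shape points and compares it with a cutoff. The 500 m threshold and the function names are assumptions made for this example, not anything defined by GTFS.

```python
import math


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two WGS84 points."""
    r = 6371000.0  # mean Earth radius in metres
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def mean_spacing_m(points):
    """Mean distance between consecutive (lat, lon) shape points."""
    if len(points) < 2:
        return 0.0
    total = sum(
        haversine_m(*points[i], *points[i + 1]) for i in range(len(points) - 1)
    )
    return total / (len(points) - 1)


def looks_abstracted(points, threshold_m=500.0):
    """Guess that a shape is abstracted when its points are far apart.

    The threshold is an arbitrary assumption; a long straight rail line is
    legitimately sparse, so this is a heuristic and not a classification.
    """
    return mean_spacing_m(points) > threshold_m


sparse = [(52.0, 5.0), (52.0, 5.1)]
dense = [(52.0, 5.0), (52.0, 5.0001), (52.0, 5.0002)]
```

Here `looks_abstracted(sparse)` is true and `looks_abstracted(dense)` is false, but a heuristic like this cannot tell a deliberately simplified shape from a genuinely straight alignment.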

e-lo commented

> I think it would be a very good best practice that such point does exist, hence a straight road with three stops would not just have two shape points (at the beginning and end) but also a point on which the stop can be projected

Totally agree.

e-lo commented

Expanding on this idea, it might also be useful to be able to map shape points to stops, so that the vehicle graph and the person graph have a formal relationship. Right now we just have a person graph.

This could be done very redundantly (and in an error-prone way) in shapes.txt, or through an explicit table of roadway nodes plus altered shapes.txt requirements, such that:

nodes.txt: node_id, node_lat, node_lon <-- would be a better place for many things currently in stops.txt, like shape points for pathways.

stops.txt: ...vehicle_node_id <-- maps each stop to a vehicle node that it can access. Note that this would enforce a 1 node : X stops relationship here, which would imply that each boarding zone / platform has its own id. This likely makes more sense than putting it in nodes because, if nodes are abstracted to intersections (as they often are), you can have multiple stops on different sides of an intersection.

vehicle_shapes.txt (or a redefined shapes.txt): veh_shape_id, node_id, shape_seq, shape_dist_travelled <-- would make sure the lat/lon values for the shapes are the same, and would recognize that vehicles are stopping at the same place even though they might be on different routes or shapes.

Note that this is often done outside of the GTFS context in many transit modeling scenarios.
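A small sketch of how these three tables could relate, using in-memory Python structures. The file layout is the proposal above and not existing GTFS; all ids, coordinates, and helper names are made up for illustration.

```python
# nodes.txt (proposed): node_id -> (node_lat, node_lon)
nodes = {
    "n1": (52.0900, 5.1100),
    "n2": (52.0910, 5.1150),
    "n3": (52.0920, 5.1200),
}

# stops.txt with the proposed vehicle_node_id column: stop_id -> node_id.
# Two platforms on different sides of an intersection share node n2.
stop_vehicle_node = {"s_east": "n2", "s_west": "n2", "s_end": "n3"}

# vehicle_shapes.txt (proposed): (veh_shape_id, node_id, shape_seq)
vehicle_shapes = [
    ("shp_a", "n1", 1),
    ("shp_a", "n2", 2),
    ("shp_a", "n3", 3),
    ("shp_b", "n2", 1),
    ("shp_b", "n3", 2),
]


def shape_nodes(shape_id):
    """Ordered node ids of one shape."""
    rows = [(seq, node) for sid, node, seq in vehicle_shapes if sid == shape_id]
    return [node for _, node in sorted(rows)]


def shape_geometry(shape_id):
    """Resolve a shape to coordinates through the shared node table."""
    return [nodes[node_id] for node_id in shape_nodes(shape_id)]


def stops_on_shape(shape_id):
    """Stops whose vehicle node lies on the shape, sorted by stop_id."""
    on_shape = set(shape_nodes(shape_id))
    return sorted(s for s, n in stop_vehicle_node.items() if n in on_shape)


def shared_nodes(shape_a, shape_b):
    """Nodes that two shapes have in common, by id and not by coordinate."""
    return set(shape_nodes(shape_a)) & set(shape_nodes(shape_b))
```

Because both shapes reference node ids instead of repeating coordinates, `shared_nodes("shp_a", "shp_b")` identifies the common stretch without any floating-point comparison of lat/lon values.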

> shared between unrelated stop sequences.

> I'm curious about the use case for needing to know this?

It is a meta discussion. If a shape is actually shared, it cannot be converted 1:1. The main problem I have with this is that GTFS is highly denormalised, except for shapes.txt and frequencies.txt. I understand why you would want to reuse a shape, and you could compute that there is reuse, but it becomes more questionable when you want to determine whether there is an error within that reuse.
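For reference, computing that reuse exists is straightforward. The sketch below groups trips by shape_id and reports shapes used by more than one distinct stop sequence. The tuple layouts are simplified stand-ins for rows of trips.txt and stop_times.txt, and the function name is hypothetical.

```python
from collections import defaultdict


def shapes_shared_across_patterns(trips, stop_times):
    """Return {shape_id: set of stop patterns} for shapes with more than one pattern.

    trips:      iterable of (trip_id, shape_id)
    stop_times: iterable of (trip_id, stop_sequence, stop_id)
    """
    per_trip = defaultdict(list)
    for trip_id, stop_sequence, stop_id in stop_times:
        per_trip[trip_id].append((stop_sequence, stop_id))

    patterns = defaultdict(set)
    for trip_id, shape_id in trips:
        # A pattern is the ordered tuple of stop_ids visited by the trip.
        pattern = tuple(stop_id for _, stop_id in sorted(per_trip[trip_id]))
        patterns[shape_id].add(pattern)

    return {sid: pats for sid, pats in patterns.items() if len(pats) > 1}


trips = [("t1", "shp_a"), ("t2", "shp_a"), ("t3", "shp_b")]
stop_times = [
    ("t1", 1, "A"), ("t1", 2, "B"), ("t1", 3, "C"),
    ("t2", 1, "B"), ("t2", 2, "C"),  # short-turn trip on the same shape
    ("t3", 1, "A"), ("t3", 2, "C"),
]
shared = shapes_shared_across_patterns(trips, stop_times)
```

This only detects that `shp_a` is shared by two patterns. It cannot say whether the short-turn trip is meant to use part of the shape or was assigned the wrong shape_id, which is the error case raised above.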

> Can't this be inferred from the density of points?

I don't want to rely on inferred information. And to be honest, I think we (as a standard) should do better at distinguishing between shapes meant for map matching and full-precision shapes.

I think the main question is whether we want to move GTFS towards a topological model (with links) or not, since it is currently topographic.