opengeospatial/ogc-geosparql

Multiple CRS serializations in a single `geo:Geometry`

Closed this issue · 17 comments

I'm merging several databases on the same physical assets, but with differing geometries and measurement instructions. We're using GeoSPARQL, so each asset relates to multiple geo:Geometry instances, each through a sub-property of geo:hasGeometry. This follows e.g. #5.

Then, for each geometry we want to publish two serializations, which is the part we're less sure about: a WGS 84 representation for web applications that can't handle other CRSs, and the authentic original in EPSG:28992 (Rijksdriehoek). (In-query CRS transformation is not feasible.) For the latter we use a WKT literal with the CRS reference inline, so that it's still a valid geo:wktLiteral. Examples below.

# in the asset dataset
ex:asset123
  ex:hasDB1xy  ex:asset123-xy ;
  ex:hasDB2shape ex:asset123-shape .

ex:asset123-xy a geo:Geometry ;
  geo:asWKT  "POINT (...)"^^geo:wktLiteral ;
  ex:asWKT-RD "<http://www.opengis.net/def/crs/EPSG/0/28992> POINT (...)"^^geo:wktLiteral .

ex:asset123-shape a geo:Geometry ;
  geo:asWKT  "POLYGON (...)"^^geo:wktLiteral ;
  ex:asWKT-RD "<http://www.opengis.net/def/crs/EPSG/0/28992> POLYGON (...)"^^geo:wktLiteral .

# in the accompanying ontology
ex:hasDB1xy rdfs:subPropertyOf geo:hasGeometry .
ex:hasDB2shape rdfs:subPropertyOf geo:hasGeometry .
ex:asWKT-RD rdfs:subPropertyOf geo:asWKT .
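
For readers who consume such literals, here is a minimal sketch of how a client might separate the inline CRS from the WKT text. It assumes the GeoSPARQL convention that an optional CRS IRI in angle brackets precedes the WKT, and that a literal without one is in CRS84. The name `split_wkt_literal` is hypothetical (not from any library), and the coordinates are illustrative placeholders rather than values from the dataset.

```python
import re
from typing import Tuple

# Default CRS of a geo:wktLiteral that carries no inline CRS IRI.
CRS84 = "http://www.opengis.net/def/crs/OGC/1.3/CRS84"

# Optional "<IRI>" prefix, then whitespace, then the WKT text itself.
_WKT_LITERAL = re.compile(r"^\s*(?:<([^>]*)>\s+)?(.*\S)\s*$", re.DOTALL)


def split_wkt_literal(lexical_form: str) -> Tuple[str, str]:
    """Return (crs_iri, wkt_text) for the lexical form of a geo:wktLiteral."""
    match = _WKT_LITERAL.match(lexical_form)
    if match is None:
        raise ValueError(f"not a WKT literal: {lexical_form!r}")
    crs_iri, wkt_text = match.groups()
    return (crs_iri or CRS84, wkt_text)


# Illustrative literals only; the real coordinates are elided in this thread.
rd = split_wkt_literal(
    "<http://www.opengis.net/def/crs/EPSG/0/28992> POINT (155000 463000)"
)
wgs = split_wkt_literal("POINT (5.387 52.155)")
print(rd)
print(wgs)
```

With this split, an application can select the serialization it is able to handle by CRS IRI instead of matching on the literal's string prefix.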

I couldn’t find prior art in this exact fashion (pdok:asWKT-RD was used next to geo:asWKT, but had neither the geo:wktLiteral datatype nor the CRS-in-WKT). @wouterbeek and I felt that using multiple geo:Geometry instances for different serializations was mixing concepts for this purpose.

I’d like to check whether, in your eyes too, this is a valid way to model multiple serializations with multiple geometries. For example, #31 discusses decoupling CRS and WKT values, but in our view multiple serializations are possible for a single geo:Geometry.

Hello @redmer. A Geometry in GeoSPARQL is characterised by a set of coordinates in a certain Coordinate Reference System (CRS) (we are working on clarifying definitions, see issue 10). That means that a single Geometry can have multiple serialisations, as long as those serialisations use the same coordinates and the same CRS. That is not the case in the example.

Would it help if the Geometry class in GeoSPARQL is extended to include properties for CRS and type (e.g. point, polygon, linestring, ...)? It seems there would be less need to define your own ontology terms then.

situx commented

Now that I read this @FransKnibbe does that mean that whenever a geometry has a GeoJSON or KML literal all other literals have to be in WGS84?
Seems like that to me

Yes, I think the conclusion is inescapable: as long as GeoJSON only permits using urn:ogc:def:crs:OGC::CRS84 (specified in RFC 7946, and the same as http://www.opengis.net/def/crs/OGC/1.3/CRS84, which is the default CRS for wktLiteral in GeoSPARQL 1.0), all other literals of a geo:Geometry with a GeoJSON serialisation should use that CRS.
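
As an illustration of that rule, a small sketch of a consistency check over the serialisations of one Geometry. It assumes that WKT literals carry their CRS inline (defaulting to CRS84) and that any GeoJSON serialisation counts as CRS84. The helper names `wkt_crs` and `crs_conflicts` are hypothetical, and the coordinates are made up.

```python
import re
from typing import List, Set

CRS84 = "http://www.opengis.net/def/crs/OGC/1.3/CRS84"
RD = "http://www.opengis.net/def/crs/EPSG/0/28992"


def wkt_crs(lexical_form: str) -> str:
    """CRS IRI of a geo:wktLiteral; CRS84 when no inline IRI is given."""
    match = re.match(r"\s*<([^>]*)>", lexical_form)
    return match.group(1) if match else CRS84


def crs_conflicts(wkt_literals: List[str], has_geojson: bool) -> Set[str]:
    """Return the CRSs in use if they disagree, else an empty set."""
    used = {wkt_crs(literal) for literal in wkt_literals}
    if has_geojson:
        # GeoJSON only permits CRS84, so it contributes that CRS.
        used.add(CRS84)
    return used if len(used) > 1 else set()


# Made-up literals: one Geometry with two CRSs, one with a single CRS,
# and one RD-only Geometry that also has a GeoJSON serialisation.
mixed = crs_conflicts(
    ["POINT (5.387 52.155)", f"<{RD}> POINT (155000 463000)"], has_geojson=False
)
consistent = crs_conflicts(["POINT (5.387 52.155)"], has_geojson=True)
rd_with_geojson = crs_conflicts([f"<{RD}> POINT (155000 463000)"], has_geojson=True)
print(mixed, consistent, rd_with_geojson)
```

An empty result means all serialisations agree on one CRS; a non-empty result lists the CRSs that would have to be split over separate Geometry instances.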

situx commented

Then I believe we should make that very clear in the examples and in the specification by adding some notes

Agreed. And let's not forget the idea of including a validator (issue 113), which can also help in communicating meaning and intent.

Thanks for your reply, @FransKnibbe. This fully answers our question. I’ve looked up the reference in the current ontology and indeed it’s stated there that a geometry is CRS-specific. For our implementation that would mean not 2 instances of geo:Geometry per ex:asset123 but 4. I can see the reasoning too.

I've seen the geo:inCRS property in the updated spec; that would indeed be very useful.

The subtypes of geo:Geometry are less useful for the present use-case, as the source databases use a complex arrangement of geometry types and asset classes. We might make our own subclasses of it, though.

Another benefit of having 4 geo:Geometry instances is that you have an identifier for each individual geometry, which is useful when building a viewer app (e.g. dynamically show/hide based on accuracy of the geometry). You can now also link geometries that are derived from a "master" geometry. For the latter case, there is an open issue to define such terminology for asserting relations between geometries.

Thanks for clarifying @FransKnibbe !

In practice I observe that geometries are used with multiple shape serializations in different CRSes. Such usage even makes it into the Spatial Data on the Web Best Practices document, example 17.

I see two possible solution directions for future GeoSPARQL standardization here:

  1. The current alignment with ISO 19107 is maintained, the human-readable description of geo:Geometry is improved to clearly explain that geometries must have exactly one CRS, and a machine-readable validation for this is introduced using SHACL.
  2. The current alignment with ISO 19107 is dropped, and geo:Geometry is defined as a conceptual 1D/2D/3D position that can be serialized in one or more CRSes.

You seem to default to solution direction (1), which is understandable because of alignment with GeoSPARQL 1.0 and ISO 19107. But solution direction (2) also has benefits: closer to how geometries are currently used in some places, consistent with all examples in Spatial Data Best Practices, more in line with other linked data usage (e.g., one IRI can have the same label expressed in multiple natural languages, where language tags are the analogue of CRSes).

In addition to the above: if geometries are intended to have exactly one CRS, then the need arises to have meaningful properties that can connect geometries that only differ in their CRS.

I see a parallel here with other linked data usage, for example sdo:thumbnail connects two sdo:ImageObjects, where the former is the full image (e.g., using TIFF format) and the latter is a downsized version intended for web use (e.g., using PNG format).

This will allow datasets that currently have one geometry containing two serializations in two CRSes to upgrade to a new version with two geometries that are appropriately connected.

For example, from:

:oldGeometry
  a geo:Geometry;
  geo:asWKT
    # for GIS users
    "<http://www.opengis.net/def/crs/EPSG/0/28992> Point (...)"^^geo:wktLiteral,
    # for Web users
    "Point (...)"^^geo:wktLiteral.

To:

#  for GIS users
:newGeometry1
  a geo:Geometry;
  geo:asWKT "<http://www.opengis.net/def/crs/EPSG/0/28992> Point (...)"^^geo:wktLiteral;
  geo:hasGeneralizedVersion :newGeometry2.
# for Web users
:newGeometry2
  a geo:Geometry;
  geo:asWKT "Point (...)"^^geo:wktLiteral.
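
A rough sketch of how such an upgrade could be scripted, assuming the WKT literals of the old geometry are available as plain strings. The function `split_by_crs`, the `-crs<N>` identifier suffix, and the coordinates are all hypothetical choices for this example. Linking the resulting geometries to each other is left out, because a property for that is only a proposal at this point.

```python
import re
from collections import defaultdict
from typing import Dict, List

CRS84 = "http://www.opengis.net/def/crs/OGC/1.3/CRS84"


def split_by_crs(geometry_id: str, wkt_literals: List[str]) -> Dict[str, List[str]]:
    """Group WKT literals by CRS and mint one geometry identifier per group.

    The "-crs<N>" suffix is an arbitrary naming choice for this sketch.
    """
    by_crs: Dict[str, List[str]] = defaultdict(list)
    for literal in wkt_literals:
        match = re.match(r"\s*<([^>]*)>", literal)
        by_crs[match.group(1) if match else CRS84].append(literal)
    # Sort the CRS IRIs so that the minted identifiers are reproducible.
    return {
        f"{geometry_id}-crs{n}": by_crs[crs]
        for n, crs in enumerate(sorted(by_crs), start=1)
    }


# Made-up coordinates standing in for the elided "Point (...)" values.
old_literals = [
    "<http://www.opengis.net/def/crs/EPSG/0/28992> Point (155000 463000)",
    "Point (5.387 52.155)",
]
new_geometries = split_by_crs(":oldGeometry", old_literals)
for new_id, literals in new_geometries.items():
    print(new_id, literals)
```

Each key in the result would become its own geo:Geometry with a single CRS, which matches the "To" situation above.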

Hello @wouterbeek!

Everything should be open for discussion. But I think direction 2 can have serious ramifications, as it would be a very fundamental change. For example, consider the topological SPARQL functions in GeoSPARQL. In the case multiple CRSs (and therefore multiple coordinate collections) are allowed for a single Geometry, how could a function like geof:sfEquals (which determines if two geometries are the same) work?
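
To make the concern concrete, here is a small sketch with made-up coordinates. It is not an implementation of geof:sfEquals; `equal_in` is a hypothetical helper that only compares point coordinates within one CRS. If a Geometry could carry one coordinate collection per CRS, two geometries might agree in one CRS and differ in another (for instance after rounding), leaving the function without a single answer.

```python
from typing import Dict, Tuple

Point = Tuple[float, float]

CRS84 = "http://www.opengis.net/def/crs/OGC/1.3/CRS84"
RD = "http://www.opengis.net/def/crs/EPSG/0/28992"

# Hypothetical geometries, each carrying one point per CRS (made-up values).
geometry_a: Dict[str, Point] = {CRS84: (5.387, 52.155), RD: (155000.0, 463000.0)}
geometry_b: Dict[str, Point] = {CRS84: (5.387, 52.155), RD: (155000.0, 463000.4)}


def equal_in(crs: str, a: Dict[str, Point], b: Dict[str, Point]) -> bool:
    """Compare the coordinates that two geometries give for one CRS."""
    return a[crs] == b[crs]


# One answer per CRS: the two geometries match in CRS84 but not in RD.
answers = {crs: equal_in(crs, geometry_a, geometry_b) for crs in (CRS84, RD)}
print(answers)
```

With exactly one CRS per Geometry this ambiguity cannot arise, since there is only one coordinate collection to compare.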

For direction 2 the comparison with language tags already works, from a different perspective. XML and JSON can be considered two different data serialisation languages, and using them both for a single Geometry instance is allowed.

As for connecting geometries that only differ in their CRS, the obvious connection is their relation with a geo:Feature that they describe. Although a Geometry can exist on its own (one could think up a triangle with certain coordinates that has no relationship to any 'real world object'), I think in most cases a Geometry exists as a representation of a spatial thing (e.g. a river or a building or an orange). So a typical spatial dataset will have Feature instances that have one or more Geometries (e.g. for different CRSs, for different levels of detail, for different dimensionalities, for different roles). But I think it is a fair point to make that the connection gets lost if one encounters a Geometry instance on its own. Currently there is no property available to link a Geometry back to its Feature. A change request for adding such a property (an inverse of geo:hasGeometry) does not exist yet, as far as I know. But it could make sense to make one. Would it help you?

GeoSPARQL 1.1 will have more specialized ways of relating a Feature to a Geometry, for example geo:hasCentroid and geo:hasBoundingBox. I wonder if it would be a good idea to change the rdfs:domain of those properties to geo:SpatialObject, making them applicable to both Geometry and Feature.

situx commented

I think language tags do not work here:
If we were to use something like a language tag then we could think of e.g. "POINT(0 0)"@geojson_epsg4326
But I think we would, first of all, lose the geometry literal type here, and actually, the EPSG has to be represented as a URI as well.
This means the "language" tag would need to represent two different kinds of information both encoded as URIs.
So if one wants to take this approach, the only way is to encode the CRS information within the literal, unless I am missing something else here.

You could potentially define a datatype URI for each serialization, CRS pair. Something like "POINT(0 0)"^^ogc:wktLiteral_epsg4326, but then you would have tons of datatypes. I recall that there was some discussion of this idea in the early days of GeoSPARQL 1.0.

situx commented

Exactly, that could be an option, but the number of datatypes would be unbounded, because according to the standard a CRS URI can be any xsd:anyURI.

So the conclusion would be to keep separate geo:Geometry instances in the above case? Again, this feels like the best choice. Also, if you're deriving a geometry in CRS 2 from the "main" geometry in CRS 1, both related to the same geo:Feature, there might be a difference in accuracy due to rounding errors in the transformation.

@FransKnibbe

Currently there is no property available to link a Geometry back to its Feature. A change request for adding such a property (an inverse of geo:hasGeometry) does not exist yet, as far as I know. But it could make sense to make one.

Personally, I think that adding inverses only increases the ontology size without adding much to it. A directed graph such as an RDF graph can always be traversed in the opposite direction (e.g. in SPARQL you can write a reverse property path using the following syntax: ^geo:hasGeometry). PROV-O refrained from adding inverses for a similar reason, I believe, but they proposed to define preferred URIs for inverse properties in the HTML documentation.
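
The same point can be illustrated outside SPARQL. A sketch, assuming the triples are held as plain (subject, predicate, object) tuples: `features_of` is a hypothetical helper, and ex:asset456 is an invented second asset added only for contrast. Reading the existing geo:hasGeometry statements backwards finds the feature without any inverse property being asserted.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]

# ex:asset456 and its geometry are invented for this example.
triples: List[Triple] = [
    ("ex:asset123", "geo:hasGeometry", "ex:asset123-xy-wgs"),
    ("ex:asset123", "geo:hasGeometry", "ex:asset123-xy-rd"),
    ("ex:asset456", "geo:hasGeometry", "ex:asset456-xy-wgs"),
]


def features_of(geometry: str, graph: List[Triple]) -> List[str]:
    """Follow geo:hasGeometry from object to subject, like ^geo:hasGeometry."""
    return [s for s, p, o in graph if p == "geo:hasGeometry" and o == geometry]


owners = features_of("ex:asset123-xy-rd", triples)
print(owners)
```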

isGeometryOf was proposed and rejected in #4.

In #141 I noticed that issues with geo:inCRS prevent its adoption for GeoSPARQL 1.1.

Our final solution fortunately doesn’t rely on it, as we only have a sub-property of geo:asWKT to make the "special" geometry queryable without turning to string comparisons. We don’t use other serializations and therefore ex:asWKT-RD suffices.

ex:asset123
  geo:hasGeometry ex:asset123-xy-wgs ;
  geo:hasGeometry ex:asset123-xy-rd .

ex:asset123-xy-wgs a geo:Geometry ;
  geo:asWKT  "POINT (...)"^^geo:wktLiteral .

ex:asset123-xy-rd a geo:Geometry ;
  ex:asWKT-RD "<http://www.opengis.net/def/crs/EPSG/0/28992> POINT (...)"^^geo:wktLiteral .

# in the accompanying ontology
ex:asWKT-RD rdfs:subPropertyOf geo:asWKT .

situx commented

I think this issue is resolved by both the SHACL shapes for GeoSPARQL 1.1 and a solution that was found by the issue owner. Closing....