NetTopologySuite/NetTopologySuite.IO.PostGis

New ordinate handling behavior

roji opened this issue ยท 10 comments

roji commented

In #5 I addressed a problem with ordinates handling, but the issue seems wider, so I'm opening a discussion here.

Both PostGisWriter and PostGisReader have a HandleOrdinates property on them. These properties determine the ordinates sent and received,

  • Sending an XY point with an XYZ writer will send an XYZ point with a zero Z coordinate
  • Receiving an XY point with an XYZ reader will return an XYZ point with a NaN Z coordinate

In effect, there seems to be no way to simply send or receive a point, with all its ordinates (and nothing but its ordinates): the user must currently set the ordinates manually on the writer and reader beforehand. Aside from being odd and unnecessary, this makes it awkward to work in situations where geometry objects of varying ordinates are used: writing both 2D and 3D points involves having two writers and two readers (or constantly changing the ordinates properties on them).

Here's a proposal for a new behavior:

  • Treat the ordinate setting on the reader/writer only as a "maximum" value, allowing users to "crop" unwanted ordinates.
  • For example, if an XYZ point is sent with an XY writer, an XY point will be sent, discarding the Z coordinate. Similarly, if an XYZ point is received with an XY reader, an XY point will be returned. This is the current behavior - no change.
  • However, if the geometry object is missing a coordinate present in the reader/writing setting, it is not completed with either 0 or NaN. So an XY point will be sent as an XY point with an XYZ-configured writer. Similarly, reading an XY point with an XYZ-configured reader would return an XY point.
  • In effect, the ordinates sent and received will be a bitwise AND between the ordinates set on the reader/writer, and the actual ordinates present on the object.
  • As a special case, Ordinates.None would mean that the reader/writer have no particular setting, and would always simply pass through the object as-is, without adding or removing anything.
  • Ordinates.None would become the default behavior for PostGisWriter and PostGisReader.

Note that the Ordinates.None behavior is already somewhat present in PostGisWriter, but is disabled as described in #6.

If this is accepted, I'll submit a PR for the changes including tests.

/cc @austindrenski @YohDeadfall

Here's a proposal for a new behavior:

Yeah... I started thinking along the same lines right around the time that I wrote my comment in #5.

The hesitation I have is that, on the writer side, this differs from how both WKBWriter and SqlServerBytesWriter do it, and probably the other binary writers as well. The intent seems to be that you set HandleOrdinates to be exactly the ordinates that you need to see on the output side (even if we need to fill in placeholders for them), which could matter if the reader side demands that the data should conform to that shape.

That hesitation isn't a dealbreaker for me, since IMO, NetTopologySuite.IO.PostGis should strongly lean towards what npgsql needs/wants from it.

@FObermaier thoughts?

roji commented

@airbreather maybe it's worth having the conversation with the EF Core folks... The current behavior is problematic for anyone dealing with mixed ordinates in the same application, and I don't think there's anything PostgreSQL-specific in this. I also agree that it's desirable to have writers/readers behave in the same way (although that shouldn't block us from better behavior either).

@bricelam any thoughts/comments?

I believe the proposal is similar to what SqlServerBytes does. A geometry is an XYZ geometry if and only if it specifies a Z value (0 is a value; NaN/NULL is not).

SQLite is a trickier because you can't store an XY geometry in an XYZ column. The serialized value must include Z even if it's NaN.

@roji is Postgis lenient to allow XY geometries in XYZ/M columns?
Is there a use case for deliberatly mixing XY/XYZ/XYM/XYZM geometries in one column?
Even if this occurs, the model probably makes some assumptions about the geometries' ordinates. Thus the PostGIS geometry column should reflect that.

My use case for having the reader return exactly the ordinates specified in HandleOrdinates is that one might need e.g. a measure value temporarily, even if it is not stored in the data.

I see that OR ing HandleOrdinates with Ordinates.XY in the setter is a mistake.

roji commented

@roji is Postgis lenient to allow XY geometries in XYZ/M columns?

PostGIS allows columns to be defined in two ways:

  • GEOMETRY (or GEOGRAPHY), which is totally polymorphic: you can put in a point, a linestring, XY, XYZ, whatever.
  • GEOMETRY (POINT), which is constrained to a 2D point - it will reject both non-point geometries and points with non-XY ordinates.

Is there a use case for deliberatly mixing XY/XYZ/XYM/XYZM geometries in one column?

I'm not sure there is, but the point about PostGisWriter (and Reader) is that it should be possible to share them across multiple columns, and it seems much more acceptable to mix XY and XYZ within the same database. At least at the moment, Npgsql instantiates one writer and reader for a physical connection at the database (it may even use single instances for all connections in the future); it would be problematic (and odd) to have to instantiate a writer for different columns because their ordinates are different.

My use case for having the reader return exactly the ordinates specified in HandleOrdinates is that one might need e.g. a measure value temporarily, even if it is not stored in the data.

Understood, but if one wants to convert an XY point stored in the database to an XYZ point for any reason, that's very easy to do by the user and doesn't seem like it should be the responsibility of the reader - it seems to me that the latter should simply read the geometry object as it is defined in the database.

Note that there are two separate questions here:

  • Supporting a mode where readers and writers simply pass along data, without any truncating or addition (i.e. if the user specifies Ordinates.None). To me this seems like a pretty clear need which isn't met today, and should also be the default, since neither the reader/writer nor NTS in general can assume any specific ordinates that the user needs in their application.
  • Assuming the user does set a specific ordinates (e.g. XYZ), what should be done if they later supply an incomplete object with regards to what they specified (e.g. an XY point). The current behavior is to silently complete the object (setting Z to zero), which we could maintain, but I'd argue that sending the object as-is is more correct. The latter behavior would be more useful for the polymorphic PostGis type (GEOMETRY), the choice of zero seems arbitrary, etc. If the server is set up to reject non-XYZ points, then an exception seems like the desirable outcome.
roji commented

I see that OR ing HandleOrdinates with Ordinates.XY in the setter is a mistake.

Yeah, I understand what you mean. Though it's worth mentioning that assuming Ordinates.None is redefined to mean "infer from the object" (like #5 proposes to do), then the OR'ing with XY doesn't really do anything anymore - I don't think there's any value aside from None that doesn't contain XY anyway. But this could definitely be removed regardless.

peetw commented

Supporting a mode where readers and writers simply pass along data, without any truncating or addition (i.e. if the user specifies Ordinates.None). To me this seems like a pretty clear need which isn't met today, and should also be the default, since neither the reader/writer nor NTS in general can assume any specific ordinates that the user needs in their application.

This would be useful (although not essential) for us when converting to/from geometries in NHibernate.Spatial.PostGis.

roji commented

ping

Sorry for the delay. Having thought about it over for a while, I'd propose the follwing:

  • Reinterpret Ordinates.None to a meaning of InferFrom[Object|Stream|Input]. We can think about relabeling it as well but I think it is defined in GeoAPI which makes it a bit heavy.
  • Remove ORing value with Ordinates.XY in the HandleOrdinates setter.
  • If HandleOrdiantes has been set to any other value than Ordinates.None the reader and writer should adhere to this setting. I'm not convinced that throwing an exception is better than setting a default value, but that should not be a stopper.
roji commented

Thanks for the response @fobermeier, I'll submit a PR on this soon. I can also look at the other IO projects after this one.