developmentseed/geojson-pydantic

Rationale for not supporting empty geometries?

spolloni opened this issue · 7 comments

Hi there! Just wondering if it is a deliberate decision that empty geometries don't pass validation because these constrained lists:

# Coordinate arrays
MultiPointCoords = conlist(Position, min_items=1)
LineStringCoords = conlist(Position, min_items=2)
MultiLineStringCoords = conlist(LineStringCoords, min_items=1)
LinearRing = conlist(Position, min_items=4)
PolygonCoords = conlist(LinearRing, min_items=1)
MultiPolygonCoords = conlist(PolygonCoords, min_items=1)

e.g:

>>> from shapely.geometry import mapping, Polygon
>>> from geojson_pydantic import Polygon as PydanticPolygon
>>> PydanticPolygon.parse_obj(mapping(Polygon()))
pydantic.error_wrappers.ValidationError: 1 validation error for Polygon
coordinates
  ensure this value has at least 1 items (type=value_error.list.min_items; limit_value=1)

Empty geometries seem to be permitted per the geojson spec

Here is what the spec says:

A GeoJSON Geometry object of any type other than
"GeometryCollection" has a member with the name "coordinates".
The value of the "coordinates" member is an array. The structure
of the elements in this array is determined by the type of
geometry. GeoJSON processors MAY interpret Geometry objects with
empty "coordinates" arrays as null objects.

It seems that maybe we should support empty [] coordinates 🤷‍♂️. the constrained list were introduced in #44 (ping @geospatial-jeff)

+1, the min_items constraints here don't match the official GeoJSON schema here:

This is unexpected/breaking for me moving to pydantic

I have been working on an update to address this. Unfortunately the WKT functions were based on the constraints and caused some issues. So I ended up down a bit of a rabbit hole.

🤔 with #100 I though we fixed this but correct me if I'm wrong @eseglem but we do not support empty coordinates array for all the geometry

Point(type="Point", coordinates=[])

ValidationError: 2 validation errors for Point
coordinates
  wrong tuple length 0, expected 2 (type=value_error.tuple.length; actual_length=0; expected_length=2)
coordinates
  wrong tuple length 0, expected 3 (type=value_error.tuple.length; actual_length=0; expected_length=3)
LineString(type="LineString", coordinates=[])
---------------------------------------------------------------------------
ValidationError                           Traceback (most recent call last)
Input In [8], in <cell line: 1>()
----> 1 LineString(type="LineString", coordinates=[])

ValidationError: 1 validation error for LineString
coordinates
  ensure this value has at least 2 items (type=value_error.list.min_items; limit_value=2)
from geojson_pydantic import MultiPoint

MultiPoint(type="MultiPoint", coordinates=[])

@vincentsarago Correct. The spec says a point requires x and y. Whereas multi point is an array of points. So the array can be empty but a point can't. A point can still be null though. And a line string requires 2 points, but a multi line string is an array of line strings with no length requirement. It should all match up with the json schema linked above. I checked the schema and spec multiple times to make sure everything lined up.

thanks folks!