Polygon centroid precision issue without GEOS
mstenta opened this issue · 14 comments
Without GEOS, I'm finding that the Polygon centroid() method is not giving me correct results. For example, given the following polygon WKT:
POLYGON ((-72.200767850924 41.765288098486, -72.200742369939 41.765238083337, -72.200767850924 41.765257089098, -72.200783944178 41.765279095762, -72.200767850924 41.765288098486))
Plugging that into GeoPHP (note that I am using the Drupal GeoPHP module):
geophp_load();
$wkt = 'POLYGON ((-72.200767850924 41.765288098486, -72.200742369939 41.765238083337, -72.200767850924 41.765257089098, -72.200783944178 41.765279095762, -72.200767850924 41.765288098486))';
$geometry = geoPHP::load($wkt);
$centroid = $geometry->centroid();
Running that code without GEOS gives the following centroid coordinates (NOT CORRECT):
-72.2089388822
41.769994803469
But running it WITH GEOS, I get these points instead (CORRECT):
-72.20076472168
41.765266382233
FYI I am testing this by toggling GEOS by simply adding return FALSE;
to the beginning of geoPHP::geosInstalled() - to trick it into thinking GEOS is not installed.
I'm not sure if this is a known issue - I haven't found it posted anywhere. Has anyone else come across it?
Could it have something to do with PHP float precision operations? Should we be using BCMath in the non-GEOS code to avoid floating errors? http://php.net/manual/en/language.types.float.php
I forked the repo to http://github.com/mstenta/geoPHP and started a new branch called "bcmath" - with the intention of providing BCMath operations for all arithmetic in the library.
I started with a single commit that uses BCMath operations (where available) in the Polygon area() and centroid() methods arithmetic.
With BCMath, the code in the issue summary above returns the following centroid points:
-72.200767850924
41.765288098486
MUCH closer to the point returned with GEOS - but still not exact. At least it's inside the polygon now...
What do you think? Do you want me to continue with this BCMath branch and add it for any/all arithmetic in the GeoPHP library? I'll clean up how it's done (perhaps by moving the common code into new private methods).
Here is a quick link to the comparison of my new bcmath branch to master: master...mstenta:bcmath
I expanded my "bcmath" branch: all arithmetic in Polygon and LineString are now covered - as far as I can tell those are the only ones that use arithmetic. Are there others I might have missed?
Also: I added a new helper function to determine if BCMath is installed: geoPHP::bcmathInstalled()
See the full branch comparison here: master...mstenta:bcmath
Through further testing I determined that I needed to set the scale of BCMath operations using bcscale().
Without it, arithmetic using latitude/longitude (that have a very high decimal precision) wasn't working. For example, Polygon::area() was returning 0 when it shouldn't have been.
I tested with various polygons and multipoints and settled on a precsion of 24. It seems to depend greatly on the exact point coordinates being used - but 24 seems to be a safe bet. I found that for some points, a precision of 12 was sufficient, but others needed 16 - even though they both had the same number of digits after the decimal place (12). So a scale of 24 should work for all cases.
I added the bcscale(24); function call to the geoPHP::bcmathInstalled() function so that it gets applied globally.
Thinking a bit more about that... is calling bcscale() in the geoPHP class too invasive? That will set the scale of all BCMath functions globally for the rest of the PHP page request. So any other code that uses BCMath could be affected. Maybe we should explicitly define the scale in each BCMath function call?
I discovered an issue in the LineString::greatCircleLength() and LineString::haversineLength() methods, using bcmath.
I'll need to do some more testing of those methods...
Btw, I'm working on a better centroid calculation supporting polygon holes, and following OGC' weighted average standard (0D - number of points, 1D - length of segments, 2D - area of components). Its almost ready, I only have trouble with "GeometryCollection inside GeometryCollection" case.
@BathoryPeter Any update on this?
P.S. Library helped a lot, thanks!
Sorry for stupidity...not really working with git a lot...is it in master branch already? I last downloaded this archive: https://phayes.github.io/bin/current/geoPHP/geoPHP.tar.gz
Unfortunately, not yet.
Its best to download from my fork: https://github.com/funiq/geoPHP
(please note that it uses PSR-4 namespaces so you will need an autoloader)
I'm totally new to the types of calculations you make, so simply maybe maybe you dont mind to answer one more question (just for my understanding)
I have a multypoligon - it is square and with a square hole inside... Where will be the centroid of this multypolygon according to your calculations?
Know it is probably obvious for you, but breaks my mind for now)))
Mybe inside the hole, but Centroid doesn't garantee anything about that. Look at Wikipedia or Postgis docs:
http://postgis.net/docs/ST_Centroid.html
An other geospatial method might be useful for you, named Point on surface (but its not implemented in GeoPHP yet)
We found a bug in my changes that came about due to a change in the way bcmod()
works in PHP 7.2+: https://www.drupal.org/project/geophp/issues/3050510
I rebased my branch onto latest master and added a commit that fixes the issue.
Thinking a bit more about that... is calling bcscale() in the geoPHP class too invasive? That will set the scale of all BCMath functions globally for the rest of the PHP page request. So any other code that uses BCMath could be affected.
I also reverted the commit that sets bcscale(24)
globally. This is absolutely too invasive. Not all GeoPHP users will be working with small latitude/longitude geometries. A better approach is for downstream code to set/reset the scale themselves if they need to. I am updating farmOS geometry code to do this instead. We need that because we are working with field/bed boundaries on farms which can potentially be very small, so high precision is needed.
The PR is here: #115 - but it's still not quite ready, per my comment above:
I discovered an issue in the LineString::greatCircleLength() and LineString::haversineLength() methods, using bcmath.