Toblerity/rtree

Default index bounds are invalid, opposite interleave

adamjstewart opened this issue · 8 comments

I noticed the following behavior when trying to compute the number of entries in an empty index:

$ python
>>> from rtree.index import Index
>>> idx = Index(interleaved=True)
>>> idx.bounds
[1.7976931348623157e+308, 1.7976931348623157e+308, -1.7976931348623157e+308, -1.7976931348623157e+308]
>>> idx.count(idx.bounds)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/Adam/.spack/.spack-env/view/lib/python3.9/site-packages/rtree/index.py", line 518, in count
    p_mins, p_maxs = self.get_coordinate_pointers(coordinates)
  File "/Users/Adam/.spack/.spack-env/view/lib/python3.9/site-packages/rtree/index.py", line 357, in get_coordinate_pointers
    raise core.RTreeError(
rtree.exceptions.RTreeError: Coordinates must not have minimums more than maximums
>>> idx = Index(interleaved=False)
>>> idx.bounds
[1.7976931348623157e+308, -1.7976931348623157e+308, 1.7976931348623157e+308, -1.7976931348623157e+308]
>>> idx.count(idx.bounds)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/Adam/.spack/.spack-env/view/lib/python3.9/site-packages/rtree/index.py", line 518, in count
    p_mins, p_maxs = self.get_coordinate_pointers(coordinates)
  File "/Users/Adam/.spack/.spack-env/view/lib/python3.9/site-packages/rtree/index.py", line 357, in get_coordinate_pointers
    raise core.RTreeError(
rtree.exceptions.RTreeError: Coordinates must not have minimums more than maximums

The bounds of an empty index have the opposite interleave pattern from the index itself. This causes a number of issues in other parts of the code as well.

hobu commented

The bounds of an empty index have the opposite interleave pattern from the index itself.

would you be willing to provide a PR to address this?

Unfortunately I'm not sure where this problem arises from. This likely relates to #197 (comment).

I'm running into this bug in many other unexpected places. I think part of my confusion stems from the fact that bounds and bbox are apparently not the same thing? Index only has bounds, no bbox. And Item bounds and bbox always have the opposite interleave. Even more confusing, the interleave of Item has nothing to do with the interleave of Index?? For example:

$ python
>>> from rtree.index import Index

>>> idx = Index(interleaved=False)
>>> idx.insert(0, (3, 5, 2, 4))
>>> hit = list(idx.intersection(idx.bounds, objects=True))[0]
>>> hit.bounds
[3.0, 5.0, 2.0, 4.0]
>>> hit.bbox
[3.0, 2.0, 5.0, 4.0]

>>> idx = Index(interleaved=True)
>>> idx.insert(0, (3, 2, 5, 4))
>>> hit = list(idx.intersection(idx.bounds, objects=True))[0]
>>> hit.bounds
[3.0, 5.0, 2.0, 4.0]
>>> hit.bbox
[3.0, 2.0, 5.0, 4.0]

Personally, I would expect bounds and bbox to be the same thing, and for the interleave to match that of the bounds being added to the index. This is quite unintuitive and means that code that needs to work for both interleave=False and interleave=True needs to be duplicated (using bbox in one place and bounds in another).

hobu commented

interleaved was such a BAD idea, introduced by me, from a time before there were things like GeoJSON bbox, etc that would have defined a default.

If you have the time and enthusiasm, please rip it all out and no-op it. I think people would be excited to have __geo_interface__ support instead, with GeoJSON bbox as the default ordering for everything.

Good to know, we were using interleaved=True in TorchGeo because it's kind of nice for 3D rtrees but if it's discouraged and known to be buggy we'll prob switch to interleaved=False.

hobu commented

Maybe it is too late to change, and we must now suffer for this sin forever by fixing all of the issues it caused.

As far as I'm concerned, rtree is still 0.X, so you can keep changing the API and breaking backwards compatibility until a stable 1.0 release is out.

hobu commented

As far as I'm concerned, rtree is still 0.X, so you can keep changing the API and breaking backwards compatibility until a stable 1.0 release is out.

It is a ten year old library that's used all over the place. The stability has been the 🐢 releases, and our bad release naming combined with that is likely to startle people if we did actually start breaking lots of backwards compatibility stuff at this point.

My desire to fix the mistake aside, it's probably too late to remove interleaved as an option, and we should fix any obvious or identified issues caused by it.