johntruckenbrodt/pyroSAR

[Archive] use geometry instead of bounding box

johntruckenbrodt opened this issue · 6 comments

The Archive class stores the bounding box of each scene in a database for spatial querying. Some time ago the SAR driver method geometry was introduced, which returns the scene's footprint geometry instead of its bounding box. The footprint is more accurate and should replace the bounding box in the database to allow a more refined search.
Recently, two PRs were created to solve this issue:

  • #185: add a second column for the geometry while keeping the bounding box column
  • #282: first search by bounding box and then filter the returned scenes by intersection with the geometry
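To illustrate the two-stage idea from #282, here is a minimal sketch in plain Python. It is not the code from either PR: the footprints are made-up coordinates, the query is simplified to a single point rather than a geometry, and all function names are hypothetical.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]
Ring = List[Point]


def bbox_of(ring: Ring) -> Tuple[float, float, float, float]:
    """Return (xmin, ymin, xmax, ymax) of a ring of (x, y) vertices."""
    xs = [x for x, _ in ring]
    ys = [y for _, y in ring]
    return min(xs), min(ys), max(xs), max(ys)


def point_in_bbox(pt: Point, bbox: Tuple[float, float, float, float]) -> bool:
    xmin, ymin, xmax, ymax = bbox
    return xmin <= pt[0] <= xmax and ymin <= pt[1] <= ymax


def point_in_ring(pt: Point, ring: Ring) -> bool:
    """Ray casting test; points exactly on the boundary get no special handling."""
    x, y = pt
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # only edges that straddle the horizontal line through pt can be crossed
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def search(scenes: Dict[str, Ring], pt: Point) -> Tuple[List[str], List[str]]:
    """Stage 1: cheap bounding box filter. Stage 2: exact footprint test."""
    candidates = [name for name, ring in scenes.items()
                  if point_in_bbox(pt, bbox_of(ring))]
    hits = [name for name in candidates if point_in_ring(pt, scenes[name])]
    return candidates, hits


# hypothetical footprints; scene_a is tilted like a typical SAR acquisition
scenes = {
    'scene_a': [(0.0, 1.0), (3.0, 0.0), (4.0, 3.0), (1.0, 4.0)],
    'scene_b': [(10.0, 10.0), (12.0, 10.0), (12.0, 12.0), (10.0, 12.0)],
}

# this point lies in a corner of scene_a's bounding box but outside its footprint
candidates, hits = search(scenes, (0.2, 0.2))
print(candidates, hits)
```

The corner point passes the bounding box stage but is rejected by the footprint test, which is the kind of false positive a bbox-only search returns for tilted scenes.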

I think a more pragmatic approach would be to completely remove the column bbox in favor of a new column geometry. The recently introduced mechanism for loading an Archive in legacy mode and importing its content into a new database (#260) could be used to migrate the database to the new layout.

With the merge of #288, all driver classes support the method geometry (by exposing an attribute self.meta['coordinates']).
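As a rough sketch of how corner coordinates can be turned into a footprint polygon: the snippet below assumes that meta['coordinates'] is a list of (x, y) corner tuples in ring order, which may not match the actual driver layout. The function coordinates_to_wkt and the sample values are hypothetical and not part of pyroSAR.

```python
from typing import List, Tuple


def coordinates_to_wkt(coordinates: List[Tuple[float, float]]) -> str:
    """Build a WKT polygon string from corner coordinates (hypothetical helper)."""
    if len(coordinates) < 3:
        raise ValueError('a polygon needs at least three corner coordinates')
    ring = list(coordinates)
    # WKT rings must be closed, i.e. the first vertex is repeated at the end
    if ring[0] != ring[-1]:
        ring.append(ring[0])
    body = ', '.join(f'{x} {y}' for x, y in ring)
    return f'POLYGON (({body}))'


# made-up metadata in the assumed layout
meta = {'coordinates': [(10.0, 50.0), (12.5, 50.4), (12.1, 52.0), (9.6, 51.6)]}
footprint_wkt = coordinates_to_wkt(meta['coordinates'])
print(footprint_wkt)
```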

I think there is an import for older data tables in #185. If this is still of interest, I can take a look!

Hi @MarkusZehner. Now that #288 is merged, we can finally implement the Archive geometry column. Sorry, it took quite a bit of time. I wanted to make sure everything gets tested and works well.
It would be great if you could have a look. I would like to enable the migration from a database with bbox column to one with a geometry column by creating a new database and importing the old one. Modifying the structure of an existing database is quite dangerous I think. The method Archive.import_outdated was made for this:

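from pyroSAR import Archive

path_new = 'scenes.db'
path_old = 'scenes_old.db'
# open the outdated database in legacy mode and import its content into the new one
with Archive(path_new) as db_new:
    with Archive(path_old, legacy=True) as db_old:
        db_new.import_outdated(db_old)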

Hi @johntruckenbrodt, sorry for the delay on my side. I'm just having a look at how best to address this. If you have any hints for me, I'd be glad to talk. Otherwise, I'll look into replacing the 'bbox' column with a 'geometry' column in the Archive setup and adapting import_outdated to check for this column to differentiate between Archive versions.

Hi @MarkusZehner, thanks for getting back to this. No need to apologize, you're not obliged to do anything here. Yes, I would do it like you say. Now that all SAR format drivers have implemented the geometry method, the call in Archive.insert can be changed from bbox to geometry. Then, the column in the database can be renamed accordingly. Upon opening an outdated database that still has the bbox column, an error is thrown unless it is opened with legacy=True. If opened in legacy mode, the method import_outdated can create a new database and re-insert all scenes found in the old database. So basically select all file paths from the old database (column scene) and pass them as a list to Archive.insert in the new one. If you prefer, we could also have a live chat after Easter.
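The flow described above could look roughly like this when reduced to plain sqlite3. This is only a sketch, not the code from the eventual PR: the table name data, the helper names, and the sample rows are assumptions made for illustration, and the final re-insert into the new Archive is left out.

```python
import sqlite3
from typing import List


def column_names(con: sqlite3.Connection, table: str) -> List[str]:
    # PRAGMA table_info yields one row per column; index 1 holds the column name
    return [row[1] for row in con.execute(f'PRAGMA table_info({table})')]


def check_layout(con: sqlite3.Connection, legacy: bool = False) -> None:
    """Refuse to work with an outdated layout unless legacy mode is requested."""
    if 'bbox' in column_names(con, 'data') and not legacy:
        raise RuntimeError("outdated database layout (column 'bbox'); "
                           "open it in legacy mode and import it into a new database")


def scenes_to_reinsert(con_old: sqlite3.Connection) -> List[str]:
    """Collect the file paths (column 'scene') stored in the old database."""
    check_layout(con_old, legacy=True)
    return [row[0] for row in con_old.execute('SELECT scene FROM data ORDER BY scene')]


# in-memory stand-in for an outdated database with made-up content
old = sqlite3.connect(':memory:')
old.execute('CREATE TABLE data (scene TEXT PRIMARY KEY, bbox TEXT)')
old.executemany('INSERT INTO data VALUES (?, ?)',
                [('/data/b.zip', 'POLYGON ((0 0, 1 0, 1 1, 0 1, 0 0))'),
                 ('/data/a.zip', 'POLYGON ((2 2, 3 2, 3 3, 2 3, 2 2))')])

paths = scenes_to_reinsert(old)
print(paths)  # this list would then be passed to the new Archive for re-insertion
```

Re-inserting from the file paths rather than copying rows means the geometry is read from the scenes again, so the old bbox values never need to be converted.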

Hi @johntruckenbrodt, the current #296 should cover most of the above. I added some checks to Archive.__init__ and tests to cover the new import_outdated function. I hope this helps, and yes, maybe also Pizza instead of a live chat? ;)

maybe also Pizza instead of a live chat? ;)

Exactly what I had in mind 😁 🍕