davenquinn/mars-tiler

Tile reads are extremely slow even with testing datasets

Closed this issue · 8 comments

Slow tile reads

Reading COG data always takes > 1 s no matter the input dataset. At first I assumed this was just the cost of reprojecting a large raster dataset (most images used by this library are reprojected on the fly from equirectangular to web mercator). However, the slow reads persist even when using the test fixtures bundled with this library, which are windowed and downscaled to < 100 kB per file. Clearly data volume or network transfer is not the issue here.

Digging a bit, this could be related to previous GDAL reprojection issues encountered by rio-tiler (cogeotiff/rio-tiler#346). The slow network loading of data identified by @kylebarron might also be relevant (https://rasterio.groups.io/g/main/topic/72528118#468). It seems there was a problem with RasterIO wheels built against GDAL < 3.3 (although the problem might still exist in some reduced form).

It's unclear how to solve this at present.

Environment

We are running rio-tiler v3, rasterio 1.2.10 (built as a wheel) and gdal 3.4.0.

A subset of relevant environment variables:

GDAL_CACHEMAX=200
GDAL_DISABLE_READDIR_ON_OPEN=EMPTY_DIR
GDAL_HTTP_MULTIPLEX=YES
GDAL_HTTP_MERGE_CONSECUTIVE_RANGES=YES
CPL_VSIL_CURL_ALLOWED_EXTENSIONS=".tif,.TIF,.tiff"
VSI_CACHE=TRUE
VSI_CACHE_SIZE=5000000
GDAL_HTTP_VERSION=2
PROJ_NETWORK=OFF
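
These are set as process-level environment variables; for reference, a minimal sketch (not the project's actual code) of how the same GDAL/VSI options could instead be scoped to a single operation with rasterio.Env (the dataset URL below is a placeholder):

import rasterio

# Sketch only: pass the GDAL/VSI configuration per-operation instead of
# relying on process-wide environment variables.
with rasterio.Env(
    GDAL_CACHEMAX=200,
    GDAL_DISABLE_READDIR_ON_OPEN="EMPTY_DIR",
    GDAL_HTTP_MULTIPLEX="YES",
    GDAL_HTTP_MERGE_CONSECUTIVE_RANGES="YES",
    CPL_VSIL_CURL_ALLOWED_EXTENSIONS=".tif,.TIF,.tiff",
    VSI_CACHE=True,
    VSI_CACHE_SIZE=5_000_000,
):
    with rasterio.open("/vsicurl/https://example.com/some-mars-cog.tif") as src:
        print(src.profile)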

Profiling

We profiled reading each dataset as a warped VRT; roughly:

import rasterio
from rasterio.vrt import WarpedVRT

for file in datasets:
    with rasterio.open(file) as src:
        with WarpedVRT(src, crs=MARS_MERCATOR) as vrt:
            vrt.read(1)
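
For completeness, a hypothetical harness for capturing the profile below (not the project's actual script), assuming the loop above is wrapped in a function named open_all_vrts():

import cProfile
import pstats

# Sketch only: profile the VRT reads and sort by cumulative time,
# matching the ordering of the output shown below.
profiler = cProfile.Profile()
profiler.enable()
open_all_vrts()
profiler.disable()
pstats.Stats(profiler).sort_stats("cumulative").print_stats(25)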

The results suggest that the problem might not be WarpedVRT loading itself; it does seem that a long time is spent parsing projection information. Perhaps pre-loading projections could be a speedup...

         12139 function calls (11035 primitive calls) in 0.929 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.929    0.929 {built-in method builtins.exec}
        1    0.000    0.000    0.929    0.929 <string>:1(<module>)
        1    0.192    0.192    0.929    0.929 load-datasets:14(open_all_vrts)
        8    0.000    0.000    0.356    0.044 _collections_abc.py:760(get)
        8    0.000    0.000    0.356    0.044 crs.py:73(__getitem__)
        8    0.000    0.000    0.356    0.044 crs.py:195(data)
        8    0.000    0.000    0.356    0.044 crs.py:170(to_dict)
        8    0.000    0.000    0.355    0.044 crs.py:146(to_epsg)
        8    0.354    0.044    0.355    0.044 {method 'to_epsg' of 'rasterio._crs._CRS' objects}
        8    0.000    0.000    0.308    0.038 crs.py:283(to_string)
        8    0.000    0.000    0.305    0.038 crs.py:158(to_authority)
        8    0.305    0.038    0.305    0.038 {method 'to_authority' of 'rasterio._crs._CRS' objects}
        4    0.000    0.000    0.054    0.013 env.py:416(wrapper)
        4    0.022    0.006    0.048    0.012 __init__.py:55(open)
       12    0.000    0.000    0.025    0.002 crs.py:444(from_wkt)
       12    0.025    0.002    0.025    0.002 {rasterio._crs.from_wkt}
        4    0.009    0.002    0.010    0.003 {method 'read' of 'rasterio._warp.WarpedVRTReaderBase' objects}
        8    0.000    0.000    0.008    0.001 env.py:257(__enter__)
        8    0.000    0.000    0.007    0.001 env.py:302(defenv)
      202    0.000    0.000    0.007    0.000 __init__.py:1424(debug)
        8    0.005    0.001    0.007    0.001 {method 'start' of 'rasterio._env.GDALEnv' objects}

@vincentsarago any ideas on this one? I'm not sure if there are still similar issues upstream in rio-tiler. Using less-common Mars projections may be a culprit here.

We are running rio-tiler v3, rasterio 1.2.10 (built as a wheel) and gdal 3.4.0.

If you are using rasterio wheels, you are not using GDAL 3.4.0 but the version used to build the rasterio wheels (it should be 3.3, I think)

From a quick look at your profiling results, it seems most of the slow operations are PROJ-related (to_authority / to_epsg)

Could you try debugging with CPL_TIMESTAMP=ON CPL_DEBUG=ON PROJ_DEBUG=3? See OSGeo/gdal#3470 (comment)

Ah yes, my mistake. These tests weren't running against wheels after all, but against my Homebrew-installed GDAL (3.4.0), PROJ (7.2.1), etc. It seems that there are a lot of failed attempts to open a nonexistent EPSG library. Debugging with your suggestions yields:

(screenshot: CPL/PROJ debug log output)

I've also been running under Docker using rasterio wheels (1.2.10, GDAL 3.3.2). PROJ reports fewer errors than with the local version (it is not missing EPSG libraries), but the tests take even longer, with similar profiling results.

It looks like RasterIO's CRS.to_authority and CRS.to_epsg functions are the biggest culprits here. Do you know of a quick way to pre-bake RasterIO CRS objects from Proj projections? I will experiment with that and an LRU cache as first options. I think the code for those areas of RasterIO is Cython, which I am less adept at debugging...
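
For reference, one shape such a cache could take (a sketch, not the library's code; the helper name is made up):

from functools import lru_cache

from rasterio.crs import CRS


@lru_cache(maxsize=None)
def cached_rio_crs(wkt: str) -> CRS:
    # Parse the projection once; later calls with the same WKT string reuse
    # the already-constructed rasterio CRS object.
    return CRS.from_wkt(wkt)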

A few entries from profiling under docker:

        8    0.730    0.091    0.738    0.092 {method 'to_authority' of 'rasterio._crs._CRS' objects}
        8    0.000    0.000    0.727    0.091 _collections_abc.py:760(get)
        8    0.000    0.000    0.727    0.091 crs.py:73(__getitem__)
        8    0.000    0.000    0.727    0.091 crs.py:195(data)
        8    0.000    0.000    0.727    0.091 crs.py:170(to_dict)
        8    0.000    0.000    0.726    0.091 crs.py:146(to_epsg)
        8    0.718    0.090    0.726    0.091 {method 'to_epsg' of 'rasterio._crs._CRS' objects}

Do you know of a quick way to pre-bake RasterIO CRS objects from Proj projections?

https://github.com/developmentseed/morecantile/blob/master/morecantile/models.py#L185-L195

If you provide the CRS you are using, I could look on my side

All TMS calculations use the MARS_MERCATOR CRS:

MARS_MERCATOR = CRS.from_dict({"proj": "merc", "R": mars_radius, "no_defs": True})

Datasets are in either MARS2000, MARS2000_SPHERE, or MARS_EQC (also defined in the same file).
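
For reference, hedged approximations of those definitions (assuming MARS2000 is the IAU ellipsoid, MARS2000_SPHERE the mean sphere, and MARS_EQC an equirectangular projection on that sphere; the radii are the standard IAU 2000 Mars values, and the constants defined in the library's own file are authoritative):

from rasterio.crs import CRS

# Approximations for illustration only.
MARS2000 = CRS.from_dict({"proj": "longlat", "a": 3396190, "b": 3376200, "no_defs": True})
MARS2000_SPHERE = CRS.from_dict({"proj": "longlat", "R": 3389500, "no_defs": True})
MARS_EQC = CRS.from_dict({"proj": "eqc", "R": 3389500, "no_defs": True})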

A few additional aspects here:

  1. A NotGeoreferencedWarning is being thrown for each dataset, despite all datasets having CRS info (not GCPs). I'm having trouble breakpointing this line of code in Rasterio; I assume it comes from somewhere on the Cython/GDAL binding side.
NotGeoreferencedWarning: Dataset has no geotransform, gcps, or rpcs. The identity matrix will be returned.
    return writer(mempath, 'w+', driver=driver, width=width,
  2. I somewhat reworked the Morecantile TMS class to cache projection information, and it did not seem to help much.
    @property
    def rasterio_crs(self):
        """Return rasterio CRS."""

        import rasterio
        from pyproj.enums import WktVersion
        from rasterio.env import GDALVersion

        # Reuse the rasterio CRS once it has been built from the pyproj CRS.
        if self._rio_crs is not None:
            return self._rio_crs

        if GDALVersion.runtime().major < 3:
            self._rio_crs = rasterio.crs.CRS.from_wkt(
                self.crs.to_wkt(WktVersion.WKT1_GDAL)
            )
        else:
            self._rio_crs = rasterio.crs.CRS.from_wkt(self.crs.to_wkt())
        return self._rio_crs
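
For context, this is roughly how the cached property gets used on the read path (the function below is illustrative, not the library's actual code):

import rasterio
from rasterio.vrt import WarpedVRT


def read_warped_band(path, tms):
    # Warp the source dataset to the TMS CRS; tms.rasterio_crs is now built
    # once and reused across reads.
    with rasterio.open(path) as src:
        with WarpedVRT(src, crs=tms.rasterio_crs) as vrt:
            return vrt.read(1)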

Aha, now we are getting somewhere. I created a new block of tests (def test_crs_transformation_speed()) that suggests that to_epsg and to_authority are being called somewhere internally in RasterIO (not in the CRS constructor) and are taking a huge amount of time for these projections. It is probably paging through a list of Earth projections and recovering from internal errors. I'll now start searching for where these functions are called.
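
To illustrate, a minimal sketch of the kind of timing the test performs (the radius is the assumed IAU mean value; exact numbers will vary by machine and PROJ version):

import timeit

from rasterio.crs import CRS

mars_merc = CRS.from_dict({"proj": "merc", "R": 3389500, "no_defs": True})

# Building the CRS is cheap; the authority lookups that scan the Earth-centric
# EPSG catalogue are what dominate.
print(timeit.timeit(lambda: CRS.from_dict({"proj": "merc", "R": 3389500, "no_defs": True}), number=10))
print(timeit.timeit(mars_merc.to_epsg, number=10))
print(timeit.timeit(mars_merc.to_authority, number=10))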

(screenshot: test timing output)

OK @vincentsarago, I think I figured out the source of (at least a major part of) the problem. Internally in RasterIO, the to_authority and to_epsg calls are extremely costly for Mars CRS definitions, and are run every time CRS.to_string is called. By overriding these methods, I can achieve a 65x speedup (at least in the tests).

I am not sure how easy it will be to override all places where these CRS objects are constructed. This may only be resolvable with an upstream issue in RasterIO.

However, I think there remains a problem somewhere in the stack: these CRS transformation/initialization methods are being called much more often than they need to be (presumably this should only happen once per TMS/dataset).

For reference, here is the overridden CRS object:

class MarsCRS(rasterio.crs.CRS):
    # Short-circuit the expensive authority lookups: a Mars projection will
    # never match an EPSG code, so skip the catalogue scan entirely.
    def to_epsg(self):
        return None

    def to_authority(self):
        return "IAU"