perrygeo/python-rasterstats

point_query on uint8 VRT returns strange float values

allixender opened this issue · 1 comments

Describe the bug

Source raster is a GDAL VRT:

/vsicurl/https://files.isric.org/soilgrids/latest/data/wrb/MostProbable.vrt

import rasterio as rio

with rio.open("/vsicurl/https://files.isric.org/soilgrids/latest/data/wrb/MostProbable.vrt") as src:
    display(src.profile)

# shows
>>> {'driver': 'VRT', 'dtype': 'uint8', 'nodata': 255.0, 'width': 172800, 'height': 67200, 'count': 1, 'crs': CRS.from_epsg(4326), 'transform': Affine(0.002083333000000875, 0.0, -180.0,
       0.0, -0.002083333000000875, 83.99916720600001), 'blockxsize': 128, 'blockysize': 128, 'tiled': True}

Running point_query on any pixel should return integers, or at least full numbers without a fraction. Afterall the raster is uint8. Even if point_query needs to coerce into float to capture nodata values, there shouldn't be any numbers that have other values then 0 as mantissa (after the comma). However, the are cases that seem to sample various real numbers (7.54356 etc).

It is unclear to me, if I did something wrong, or point_query.

To Reproduce
Steps to reproduce the behavior:

  1. How did you install rasterstats and its dependencies?

I am on Linux, EndeavourOS/ArchLinux, but I use Conda/Miniconda environment.

  • Python 3.10.6
  • gdal 3.4.1 py310h8172e47_5 conda-forge
  • rasterio 1.2.10 py310h5e0f756_4 conda-forge
  • rasterstats 0.16.0 pypi_0 pypi
  1. What datasets are necessary to reproduce the bug? Please provide links to example data if necessary.
  1. What code is necessary to reproduce the bug? Provide the code directly below or provide links to it.
from rasterstats import point_query
import geopandas as gpd
from shapely.geometry import Point

wrb_most_probable_class_vrt = "/vsicurl/https://files.isric.org/soilgrids/latest/data/wrb/MostProbable.vrt"

testgeo = gpd.GeoSeries([ Point(27.51906, 59.25818) ], crs=4326)

point_query(testgeo, wrb_most_probable_class_vrt)

# returns
>>> [7.75555534157003]

# should be
>>> [6]

(validated with QGIS 3.28)
Screenshot_2023-02-23_10-07-08_1

Running point_query on any pixel should return integers

The point query function uses interpolation="bilinear" by default. So the expected behavior is a weighted average of the nearby pixels, likely a float.

If you want the exact pixel value, you can pass interpolate="nearest"... but if that is all you need, this library might be overkill as Rasterio can read values directly.