Esri/lerc

Different result if raster is 2D or 3D

bretttully opened this issue · 10 comments

I am writing a python wrapper that round-trips a numpy array through encode and decode. This wrapper works on both 2D and 3D numpy arrays, however, I get a surprising result when I compare the post-LERC results of individual bands.

I expected that if I encode and then decode a 256x256x2 array and do the same for each of the bands and stack them together that I should get identical results. However, this is not the case.

Is this incorrect?

image

Python test

This test fails on the last assert.

class TestLercCompressor:

    @classmethod
    def _zerror(cls, dtype: np.dtype):
        zerror = 5.0  # in unit8 space
        if np.issubdtype(dtype, np.integer):
            return dtype(zerror * (np.iinfo(dtype).max / np.iinfo(np.uint8).max))
        else:
            return dtype((zerror + 0.5) / (np.iinfo(np.uint8).max + 1))

    @pytest.mark.parametrize("dtype", [np.uint8, np.uint16, np.float32, np.double])
    def test_compress_3d(self, request, test_data_dir: Path, dtype: np.dtype):
        testname = f"{self.__class__.__name__}.{request.node.name}"
        rs = np.random.RandomState(sum(map(ord, testname)))
        if np.issubdtype(dtype, np.integer):
            raster = rs.randint(np.iinfo(dtype).min, np.iinfo(dtype).max, size=(256, 256, 2), dtype=dtype)
        else:
            raster = rs.uniform(0, 1, size=(256, 256, 2)).astype(dtype)

        z_error = self._zerror(dtype)

        raster2d = np.stack(
            [
                lerc.apply(raster=raster[..., 0], max_z_error=z_error),
                lerc.apply(raster=raster[..., 1], max_z_error=z_error),
            ],
            axis=2,
        )
        raster3d = lerc.apply(raster=raster, max_z_error=z_error)

        assert raster2d.dtype == raster.dtype
        np.testing.assert_raises(AssertionError, np.testing.assert_equal, raster, raster2d)
        np.testing.assert_allclose(raster, raster2d, rtol=0.0, atol=z_error)  # check raster2d changed
        if np.issubdtype(dtype, np.integer):
            np.testing.assert_array_less(raster2d, np.iinfo(dtype).max)
        else:
            np.testing.assert_array_less(raster2d, 1.0)

        assert raster3d.dtype == raster.dtype
        np.testing.assert_raises(AssertionError, np.testing.assert_equal, raster, raster3d)
        np.testing.assert_allclose(raster, raster3d, rtol=0.0, atol=z_error)  # check raster3d changed
        if np.issubdtype(dtype, np.integer):
            np.testing.assert_array_less(raster3d, np.iinfo(dtype).max)
        else:
            np.testing.assert_array_less(raster3d, 1.0)

        # there should be no difference between 2d and 3d
        np.testing.assert_equal(raster2d, raster3d)

C++ wrapping code using pybind11

#include "wrapLerc.hpp"

#include <pybind11/numpy.h>
#include <pybind11/stl_bind.h>

#include <LercLib/Defines.h>
#include <LercLib/Lerc.h>
#include <LercLib/Lerc_types.h>

namespace py = pybind11;

namespace {
static constexpr int LERC_VERSION = -1;  // use the latest

template <typename DataT> struct NumpyInfo {
    using NumpyT = typename py::array_t<DataT, py::array::c_style>;

    NumpyInfo() : array(), info(), data(nullptr), nDim(1), nCols(0), nRows(0), nBands(0) {}

    NumpyInfo(NumpyT inArray)
        : array(inArray), info(array.request()), data(nullptr), nDim(1), nCols(0), nRows(0), nBands(0) {
        if (info.ndim != 2 && info.ndim != 3) {
            throw std::runtime_error("LERC compression requires a 2D or 3D array");
        }
        if (info.ptr == nullptr) {
            throw std::runtime_error("The array has no data!");
        }
        data = static_cast<DataT *>(info.ptr);
        nCols = info.shape[0];
        nRows = info.shape[1];
        nBands = (info.ndim == 2) ? 1 : info.shape[2];
        if (array.size() != nCols * nRows * nBands) {
            throw std::runtime_error("Failed to correctly extract array size");
        }
    }

    NumpyInfo(const NumpyInfo &other) : NumpyInfo(std::move(NumpyT(other.info.shape, other.info.strides))) {
        // copy data into the array
        std::copy(other.data, other.data + other.array.size(), data);
    }

    NumpyInfo copy() const { return NumpyInfo(*this); }

    /// numpy array
    NumpyT array;
    /// numpy buffer protocol
    py::buffer_info info;
    /// Raw image data, row by row, band by band. This is a raw pointer, assume someone else is looking after memory
    DataT *data;
    /// number of values per pixel (for us, this is always 1)
    const int nDim;
    /// data shape
    int nCols, nRows, nBands;
};

/// Calculate the size of the required buffer size
template <typename DataT> struct LercBufferSize {
    using RasterInfo = NumpyInfo<DataT>;
    using RasterT = typename RasterInfo::NumpyT;

    static uint32_t calculate(const RasterT &raw, double maxZError) {
        RasterInfo raster(raw);
        return calculateImpl(raster, maxZError);
    }

    static uint32_t calculateImpl(const RasterInfo &raster, double maxZError) {
        LercNS::BitMask *bitMask = nullptr;  // We don't care about the mask
        uint32_t bs = 0;                     // size of outgoing Lerc blob
        auto status = LercNS::Lerc::ComputeCompressedSizeTempl<DataT>(
            raster.data, LERC_VERSION, raster.nDim, raster.nCols, raster.nRows, raster.nBands, bitMask, maxZError, bs);
        if (status != LercNS::ErrCode::Ok) {
            throw std::runtime_error("Failed to fetch the number of bytes required for the buffer");
        }
        return bs;
    }
};

template <typename DataT> struct LercCompressor {
    using RasterInfo = NumpyInfo<DataT>;
    using RasterT = typename RasterInfo::NumpyT;

    static void applyInPlace(RasterT &raw, double maxZError) {
        RasterInfo raster(raw);
        impl(raster, maxZError);
    }

    static RasterT apply(const RasterT &raw, double maxZError) {
        auto raster = RasterInfo(raw).copy();
        impl(raster, maxZError);
        return raster.array;
    }

  private:
    static void impl(RasterInfo &raster, double maxZError) {
        LercNS::BitMask *bitMask = nullptr;  // We don't care about the mask

        // --- get the buffer size
        uint32_t numBytesNeeded = LercBufferSize<DataT>::calculateImpl(raster, maxZError);

        // --- encode
        // buffer to write to, function will fail if buffer too small
        std::vector<LercNS::Byte> lercBlob(numBytesNeeded, 0);
        auto pLercBlob = lercBlob.data();

        uint32_t numBytesUsed = 0;  // num bytes written to buffer
        auto status = LercNS::Lerc::EncodeTempl<DataT>(raster.data, LERC_VERSION, raster.nDim, raster.nCols,
                                                       raster.nRows, raster.nBands, bitMask, maxZError, pLercBlob,
                                                       numBytesNeeded, numBytesUsed);
        if (status != LercNS::ErrCode::Ok) {
            throw std::runtime_error("Failed to encode data");
        }

        // --- decode
        status = LercNS::Lerc::DecodeTempl<DataT>(raster.data, pLercBlob, numBytesUsed, raster.nDim, raster.nCols,
                                                  raster.nRows, raster.nBands, bitMask);
        if (status != LercNS::ErrCode::Ok) {
            throw std::runtime_error("Failed to decode data");
        }
    }
};

template <typename DataT> void wrapLercImpl(py::module &m) {
    using LBS = LercBufferSize<DataT>;
    m.def("buffer_size", &LBS::calculate, py::arg("raster"), py::arg("max_z_error"));

    using LC = LercCompressor<DataT>;
    m.def("apply_in_place", &LC::applyInPlace, py::arg("raster"), py::arg("max_z_error"));
    m.def("apply", &LC::apply, py::arg("raster"), py::arg("max_z_error"), py::return_value_policy::move);
}
}  // namespace

namespace cppaiutils {
namespace python {

void wrapLerc(py::module &m) {
    auto mLerc = m.def_submodule("lerc");
    wrapLercImpl<uint8_t>(mLerc);
    wrapLercImpl<uint16_t>(mLerc);
    wrapLercImpl<float>(mLerc);
    wrapLercImpl<double>(mLerc);
}
}  // namespace python
}  // namespace cppaiutils

Indeed, this is expected for lossy compression (zerror != 0).
LERC compression is based on block quantization, with every block having it's own reference, usually the local minimum. The 3D block will have a slightly different reference than the 2D ones because the local minimum is different. You clearly see the blocks in the difference image.
You will also see differences when shifting the array, because the block alignment is different.

@bretttully
Repeated lossy LERC compression with different block alignment will have a cumulative effect on the error. For example, using precision 5 and compressing data twice with a different alignment (or once 2D and once 3D) will result in a max error of 10 (5 + 5) for some of the values.
If this is not acceptable, quantize the data externally in some repeatable way, before using lossless integer LERC.

@bretttully Try with zerror = 0, lossless, and see if it passes your test.

@lucianpls For repeated lossy compression with different alignment the overall compression error will accumulate. That is always true, not only for Lerc (e.g., also for jpeg). If you know the max number of lossy encodes, you can pick your max encode error per encode accordingly (e.g., 0.001 per encode to end up with an overall enc error of 0.01 after up to 10 lossy encode calls on the same area). If you don't know that max number of lossy encode calls, you better go with lossless and set the encode error to 0.

Thanks for the feedback all.

@tmaurer3 running the test with zerror = 0 passes; but I think I have found the issue and it is slightly more subtle, but also obvious... The numpy array needs to be ordered correctly.

Reviewing the algorithm when nBands > 1 LERC expects the bands to be the slowest moving index, followed by the rows. So the numpy array needs to be set up with a size of (2, 256, 256) and nBands, nRows, nCols = arr.shape. Implementing this and the test pass.

Does this make sense?

@tmaurer3 Out of curiosity, are there elements of the LERC specification that do not treat values in a third dimension as part of the same 'block' of values? That is, quantise blocks within the 2D matrices of the third dimensions independently of each other. For instance, I notice that LERC distinguishes between nBands and nValuesPerPixel. Does the LERC spec treat these differently?

@bretttully
Interesting, that test would only pass if the 2D bands in a 3D block are quantized using separate references. @tmaurer3, can you please provide more details?

From what I can tell, each 2D band is encoded independently inside this loop https://github.com/Esri/lerc/blob/master/src/LercLib/Lerc.cpp#L322

and line 325 of Lerc.cpp calculates an offset into the data array based off the assumption of data layout in the flat array.

@adriancaruana @bretttully @lucianpls There might be a misunderstanding here about the meaning of the "nDim" parameter. See also the section in the Readme underneath the C API. "About multiple bands ... ". "nDim" = number of values per pixel if you store a 1 dim array of values at each pixel. This is in addition to nBands. So the total array is like this:
[nBands, nRows, nCols, nDim]. I have to admit now that "nDim" is not a good choice for the name, but it is shorter than "nValuesPerPixel". And yes, they are treated differently. Different bands are encoded independently. This is first historical, second still good in case you know that different bands don't have much correlation. In other words, Lerc supports up to 4 dimensions. Simple example for 3D, let's say RGB image. You can either say [nBands = 3, nRows = 256, nCols = 256, nDim = 1]. Then Lerc encodes the 3 bands R, G, and B separately, and only concatenates the binary blobs. Or you say [nBands = 1, nRows = 256, nCols = 256, nDim = 3]. Then the 3 RGB values are always together. In this case Lerc tries to make use of the correlation of the R, G, and B bands. It tries encoding the G values relative to the R values, and pick the better one, relative or absolute encoding (for 8x8 blocks, and per block). Depending on how correlated they are, the second way to call Lerc can give better compression.
You can also have an array of 10 values per pixel, then you set nDim = 10. Perhaps we should have called it "nDepth".

@tmaurer3 perfect, that matches my (new) understanding of how LERC works. In which case, this is not an issue with LERC and we can close the issue.

That is a bit confusing! Intuitively, I would assume the bands change the fastest (interleaved) and the extra dimension the slowest. Either way is fine, if the convention is known.