cogeotiff/rio-tiler

Forwarding `ImageData.array.mask` in NPZ output

Closed this issue · 4 comments

In this line,

self.mask, # We use dataset mask for rendering
Should it be using self.array.mask instead of self.mask? My understanding is that the mask is not meant to be written as uint8.

should it be using self.array.mask

No. ImageData.array.mask is the numpy boolean mask of the masked array, and it has the same shape as the data array itself (e.g. multiple bands). That is why we use ImageData.mask, which is a proper alpha band compatible with rasterio/GDAL image encoding.

@property
def mask(self) -> numpy.ndarray:
    """Return Mask in form of rasterio dataset mask."""
    return numpy.logical_or.reduce(~self.array.mask) * numpy.uint8(255)
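To make the difference between the two masks concrete, here is a minimal sketch using plain numpy. The 2-band, 2x2 array is a made-up stand-in for ImageData.array (it is not produced by rio-tiler), and the alpha expression is copied from the property above.

```python
import numpy

# Hypothetical 2-band, 2x2 masked array standing in for ImageData.array
data = numpy.arange(8, dtype="uint16").reshape(2, 2, 2)
band_mask = numpy.array(
    [
        [[True, False], [False, False]],  # band 1: pixel (0, 0) is masked
        [[True, False], [False, True]],   # band 2: pixels (0, 0) and (1, 1) are masked
    ]
)
array = numpy.ma.MaskedArray(data, mask=band_mask)

# Same expression as the `mask` property: a pixel is valid (255)
# if it is unmasked in at least one band, otherwise 0
alpha = numpy.logical_or.reduce(~array.mask) * numpy.uint8(255)

print(array.mask.shape)  # (2, 2, 2): one boolean mask per band
print(alpha.shape)       # (2, 2): a single alpha band
print(alpha.tolist())    # [[0, 255], [255, 255]]
```

Note that pixel (1, 1), which is masked only in band 2, comes out as valid (255) in the alpha band, so the per-band information is not recoverable from the alpha band alone.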

Hmm, that's true for the rasterio/GDAL case, but when saving as NPZ there isn't really a required mask format, right? Would it make sense to just use numpy's masked-array mask format?

Ah yes, for NPZ we could save the whole numpy mask array. This might complicate the workflow and create some confusion, though (having multiple mask/alpha types).

Is there any specific reason why a user would want the numpy masked array (ImageData.array.mask) instead of the alpha band (ImageData.mask)?

Our use case is to save to a file and read it back in as numpy arrays, effectively doing just np.savez_compressed(f, data=img.array, mask=img.array.mask). I just need the actual data and mask to make sure the downstream calculations are aware of what's masked.
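A rough sketch of that round trip, assuming a toy masked array in place of img.array and an in-memory buffer in place of a real file (both are illustrative choices, not part of rio-tiler). This sketch passes the underlying `.data` explicitly so the stored values and the mask are clearly separate:

```python
import io

import numpy

# Hypothetical stand-in for img.array (2 bands, 2x2 pixels)
array = numpy.ma.MaskedArray(
    numpy.arange(8, dtype="uint16").reshape(2, 2, 2),
    mask=[
        [[True, False], [False, False]],
        [[True, False], [False, True]],
    ],
)

# Write the raw values and the per-band boolean mask as two NPZ entries
buffer = io.BytesIO()
numpy.savez_compressed(buffer, data=array.data, mask=array.mask)

# Read both entries back and rebuild the masked array
buffer.seek(0)
with numpy.load(buffer) as npz:
    restored = numpy.ma.MaskedArray(npz["data"], mask=npz["mask"])

# The per-band mask survives the round trip unchanged
assert numpy.array_equal(restored.mask, array.mask)
assert numpy.array_equal(restored.data, array.data)
```

With a real file, the buffer would be replaced by a path or an open file object.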

If it adds too much complexity, I can always just call np.savez_compressed directly - no worries about accommodating this particular use case.