image-rs/image-tiff

Tile-Based images can not be read

Jasper-Bekkers opened this issue · 9 comments

It looks like the tiff crate currently can't read a large number of valid TIFF files, specifically any geo data that I've found. I've attached a small sample file, since it's unclear to me how to fix this without going deep into the TIFF file format (or the image-tiff codebase).

ASTGTMV003_N58W128_num.zip

That's a tiled TIFF. We only support strip-based files at the moment.

tiffinfo output
TIFFReadDirectory: Warning, Unknown field with tag 33550 (0x830e) encountered.
TIFFReadDirectory: Warning, Unknown field with tag 33922 (0x8482) encountered.
TIFFReadDirectory: Warning, Unknown field with tag 34735 (0x87af) encountered.
TIFFReadDirectory: Warning, Unknown field with tag 34736 (0x87b0) encountered.
TIFFReadDirectory: Warning, Unknown field with tag 34737 (0x87b1) encountered.
TIFFReadDirectory: Warning, Unknown field with tag 42112 (0xa480) encountered.
TIFF Directory at offset 0x8 (8)
  Image Width: 3601 Image Length: 3601
  Tile Width: 256 Tile Length: 256
  Bits/Sample: 8
  Sample Format: unsigned integer
  Compression Scheme: LZW
  Photometric Interpretation: min-is-black
  Samples/Pixel: 1
  Planar Configuration: single image plane
  Tag 33550: 0.000278,0.000278,0.000000
  Tag 33922: 0.000000,0.000000,0.000000,-128.000139,59.000139,0.000000
  Tag 34735: 1,1,0,7,1024,0,1,2,1025,0,1,1,2048,0,1,4326,2049,34737,7,0,2054,0,1,9102,2057,34736,1,1,2059,34736,1,0
  Tag 34736: 298.257224,6378137.000000
  Tag 34737: WGS 84|
  Tag 42112: <GDALMetadata>
  <Item name="Band_1">Band 1</Item>
  <Item name="DESCRIPTION" sample="0" role="description">Band 1</Item>
</GDALMetadata>

  Predictor: none 1 (0x1)
TIFF Directory at offset 0x926 (2342)
  Subfile Type: reduced-resolution image (1 = 0x1)
  Image Width: 1801 Image Length: 1801
  Tile Width: 128 Tile Length: 128
  Bits/Sample: 8
  Sample Format: unsigned integer
  Compression Scheme: LZW
  Photometric Interpretation: min-is-black
  Samples/Pixel: 1
  Planar Configuration: single image plane
  Predictor: none 1 (0x1)
TIFF Directory at offset 0x10dc (4316)
  Subfile Type: reduced-resolution image (1 = 0x1)
  Image Width: 1201 Image Length: 1201
  Tile Width: 128 Tile Length: 128
  Bits/Sample: 8
  Sample Format: unsigned integer
  Compression Scheme: LZW
  Photometric Interpretation: min-is-black
  Samples/Pixel: 1
  Planar Configuration: single image plane
  Predictor: none 1 (0x1)
TIFF Directory at offset 0x14aa (5290)
  Subfile Type: reduced-resolution image (1 = 0x1)
  Image Width: 901 Image Length: 901
  Tile Width: 128 Tile Length: 128
  Bits/Sample: 8
  Sample Format: unsigned integer
  Compression Scheme: LZW
  Photometric Interpretation: min-is-black
  Samples/Pixel: 1
  Planar Configuration: single image plane
  Predictor: none 1 (0x1)
TIFF Directory at offset 0x1758 (5976)
  Subfile Type: reduced-resolution image (1 = 0x1)
  Image Width: 451 Image Length: 451
  Tile Width: 128 Tile Length: 128
  Bits/Sample: 8
  Sample Format: unsigned integer
  Compression Scheme: LZW
  Photometric Interpretation: min-is-black
  Samples/Pixel: 1
  Planar Configuration: single image plane
  Predictor: none 1 (0x1)
TIFF Directory at offset 0x1886 (6278)
  Subfile Type: reduced-resolution image (1 = 0x1)
  Image Width: 401 Image Length: 401
  Tile Width: 128 Tile Length: 128
  Bits/Sample: 8
  Sample Format: unsigned integer
  Compression Scheme: LZW
  Photometric Interpretation: min-is-black
  Samples/Pixel: 1
  Planar Configuration: single image plane
  Predictor: none 1 (0x1)
TIFF Directory at offset 0x19b4 (6580)
  Subfile Type: reduced-resolution image (1 = 0x1)
  Image Width: 226 Image Length: 226
  Tile Width: 128 Tile Length: 128
  Bits/Sample: 8
  Sample Format: unsigned integer
  Compression Scheme: LZW
  Photometric Interpretation: min-is-black
  Samples/Pixel: 1
  Planar Configuration: single image plane
  Predictor: none 1 (0x1)
TIFF Directory at offset 0x1a82 (6786)
  Subfile Type: reduced-resolution image (1 = 0x1)
  Image Width: 57 Image Length: 57
  Tile Width: 128 Tile Length: 128
  Bits/Sample: 8
  Sample Format: unsigned integer
  Compression Scheme: LZW
  Photometric Interpretation: min-is-black
  Samples/Pixel: 1
  Planar Configuration: single image plane
  Predictor: none 1 (0x1)
TIFF Directory at offset 0x1b30 (6960)
  Subfile Type: reduced-resolution image (1 = 0x1)
  Image Width: 45 Image Length: 45
  Tile Width: 128 Tile Length: 128
  Bits/Sample: 8
  Sample Format: unsigned integer
  Compression Scheme: LZW
  Photometric Interpretation: min-is-black
  Samples/Pixel: 1
  Planar Configuration: single image plane
  Predictor: none 1 (0x1)
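
For reference, the tile layout implied by the tiffinfo values above can be worked out with ceiling division, since a tiled TIFF stores full-size tiles and pads the ones on the right and bottom edges. Here is a minimal sketch in Rust; `TileGrid`, `ceil_div` and the method names are hypothetical and chosen for illustration only, they are not part of the tiff crate:

```rust
/// Ceiling division for a non-zero divisor, written so it cannot overflow.
fn ceil_div(n: u32, d: u32) -> u32 {
    n / d + u32::from(n % d != 0)
}

/// Hypothetical helper describing how tiles cover an image.
/// Sizes are (width, height) in pixels.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct TileGrid {
    image: (u32, u32),
    tile: (u32, u32),
    across: u32, // tiles per row
    down: u32,   // tiles per column
}

impl TileGrid {
    /// Returns `None` if either tile dimension is zero.
    fn new(image: (u32, u32), tile: (u32, u32)) -> Option<Self> {
        if tile.0 == 0 || tile.1 == 0 {
            return None;
        }
        Some(TileGrid {
            image,
            tile,
            across: ceil_div(image.0, tile.0),
            down: ceil_div(image.1, tile.1),
        })
    }

    /// Total number of tiles in the grid.
    fn tile_count(&self) -> u64 {
        u64::from(self.across) * u64::from(self.down)
    }

    /// Size of the part of tile `index` (row-major order) that overlaps
    /// the image. Edge tiles are stored padded to the full tile size,
    /// but only this region holds image pixels.
    fn valid_size(&self, index: u64) -> Option<(u32, u32)> {
        if index >= self.tile_count() {
            return None;
        }
        let col = (index % u64::from(self.across)) as u32;
        let row = (index / u64::from(self.across)) as u32;
        let w = self.tile.0.min(self.image.0 - col * self.tile.0);
        let h = self.tile.1.min(self.image.1 - row * self.tile.1);
        Some((w, h))
    }
}
```

For the first directory (3601 x 3601 with 256 x 256 tiles), `TileGrid::new((3601, 3601), (256, 256))` gives 15 tiles in each direction, 225 in total, and the last column of tiles has only 17 valid pixels in width. The 57 x 57 and 45 x 45 overviews each fit in a single, mostly padded 128 x 128 tile.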

Alright - would you like this issue to stay open as a feature request, or should I close it as "as designed"?

Hey there !
I've heard that the image-crate needs contributors, especially the tiff crate. I don't know much about the format, but I read part of the spec today, and it's surprisingly comprehensible.

I'll see if I can do something about this issue :)

btw - Is there a Discord / Zulip chat for image-rs, or does all communication happen on GitHub ?

That's great to hear, looking forward to a PR. For discussions, you can join gitter/matrix:

https://gitter.im/image-rs/image
https://app.element.io/#/room/#image-rs_image:gitter.im

Tile decoding is mostly done I think.

The only thing I'd like to improve before closing this issue is incremental decoding (mainly making the code cleaner).

I'd also like to point out that the current incremental API for Tiles doesn't work in the same way as the API for Strips.
The difference is that read_strip() and read_image() are dependent. If the user reads a few strips with read_strip(), then read_image() only decodes the rest of the image without the first few strips. Also, after read_strip() decodes the last strip, it loops back to the first one. I'm not sure if that's desirable.

The read_tile() and read_image() functions are independent. I didn't want to make them dependent because it's not clear what to do when the user calls read_image() and the number of tiles already decoded with read_tile() != tiles_accross.
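
To make the difference concrete, here is a toy model of the two behaviours in Rust. It is only a sketch of the semantics described above, not the crate's code: `StripModel` and `TileModel` are hypothetical types, the methods return chunk indices instead of pixel data, and two details are assumptions (that read_image() resets the strip cursor, and that read_tile() wraps around like read_strip()).

```rust
/// Strip behaviour as described above: read_strip() and read_image()
/// share one cursor.
struct StripModel {
    strips: u32,
    next: u32,
}

impl StripModel {
    /// Returns `None` for an image without strips.
    fn new(strips: u32) -> Option<Self> {
        if strips == 0 {
            return None;
        }
        Some(StripModel { strips, next: 0 })
    }

    /// Returns the index of the strip that would be decoded, then advances
    /// the cursor, looping back to the first strip after the last one.
    fn read_strip(&mut self) -> u32 {
        let index = self.next;
        self.next = (self.next + 1) % self.strips;
        index
    }

    /// Returns only the strips that have not been read yet.
    /// Assumption: the cursor is reset to 0 afterwards.
    fn read_image(&mut self) -> Vec<u32> {
        let rest: Vec<u32> = (self.next..self.strips).collect();
        self.next = 0;
        rest
    }
}

/// Tile behaviour as described above: read_image() ignores the cursor
/// used by read_tile().
struct TileModel {
    tiles: u32,
    next: u32,
}

impl TileModel {
    /// Returns `None` for an image without tiles.
    fn new(tiles: u32) -> Option<Self> {
        if tiles == 0 {
            return None;
        }
        Some(TileModel { tiles, next: 0 })
    }

    /// Assumption: wraps around after the last tile, like read_strip().
    fn read_tile(&mut self) -> u32 {
        let index = self.next;
        self.next = (self.next + 1) % self.tiles;
        index
    }

    /// Always covers every tile, however many were read one by one.
    fn read_image(&self) -> Vec<u32> {
        (0..self.tiles).collect()
    }
}
```

In this model, two read_strip() calls on a four-strip image followed by read_image() yield only strips 2 and 3, while the same sequence on `TileModel` still yields all four tiles.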

@HeroicKatora What I need to know is if it's ok that the behaviour is inconsistent and if not, which behaviour to choose.

Honestly, the dependence between read_strip() and read_image() is kind of ugly. I'd say leave things how they are for read_tile() and we can decide separately whether to change the behavior with strips.

From a high-level perspective I get the impression that we should prefer the behavior of read_tile, in that it works exclusively with tile-based image data and does not interact with any other reader method. This makes it a clear optimization: the caller can choose the strategy for reading image data, and the strategies don't interfere with each other. It would also simplify the respective implementations.

Okay, thanks :) That confirms what I was thinking.

This should be resolved now