fiji/SPIM_Registration

Is compression in HDF5 disabled when I set "no compression"?

emmenlau opened this issue · 7 comments

I have re-saved several datasets as HDF5 with the compression set to "No". However, after re-saving, the file is only 60% of the size of the original file. That seems unlikely: the HDF5 file additionally contains multiple resolution levels, so it should be at least as large as the original?!

I cannot install HDF View to inspect the file myself, because HDF View requires admin permissions on Windows. Sorry. Here is an excerpt from the logs, if that helps:

```
Maximum number of pixels in any view: n=3240345600 (2^31 < n < 2^32 px), using CellImg(256).
Minimal resolution in all dimensions is: 0.2893347442150116
(The smallest resolution in any dimension; the distance between two pixels in the output image will be that wide)
(Sat Aug 08 17:51:50 GMT+02:00 2015): Saved xml 'D:\UserTemp\MarioEmmenlauer\BeadBasedMultivewTest\dataset.xml'.
Resaving 80 views As HDF5 ...
Saved XML 'D:/UserTemp/MarioEmmenlauer/BeadBasedMultivewTest/dataset.xml'.
HDF5 file: D:\UserTemp\MarioEmmenlauer\BeadBasedMultivewTest\dataset.h5
proccessing timepoint 1 / 1
proccessing setup 1 / 80
Investigating file 'D:\UserTemp\MarioEmmenlauer\BeadBasedMultivewTest\manual_multiview20x_beads0.52um1zu12000_ fish.czi'.
Sat Aug 08 17:52:24 GMT+02:00 2015: Opening 'manual_multiview20x_beads0.52um1zu12000_ fish.czi' [1920x1920x879 angle=0 ch=561 illum=0 tp=0 type=uint16 img=CellImg<UnsignedShortType>]
writing level 0
writing level 1
writing level 2
writing level 3
proccessing setup 2 / 80
...
```
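
To put a rough number on the expectation in the question, here is a small sketch that estimates the uncompressed size of one view plus its resolution levels. The view dimensions, the `uint16` type, and the four levels (0 to 3) come from the log; the subsampling factor of 2 per axis and per level is purely an assumption for illustration, since the log does not show the factors actually used, and `level_voxels` is a hypothetical helper.

```python
# Rough size estimate for one view of the dataset in the log above.
# ASSUMPTION: each resolution level halves every axis (factor 2);
# the real subsampling factors are not shown in the log.

def level_voxels(dims, level, factor=2):
    """Voxel count of one resolution level, rounding each axis up."""
    count = 1
    for d in dims:
        count *= -(-d // factor ** level)  # ceiling division
    return count

dims = (1920, 1920, 879)   # from the log: 1920x1920x879
bytes_per_voxel = 2        # uint16
levels = 4                 # the log writes levels 0 to 3

raw_bytes = level_voxels(dims, 0) * bytes_per_voxel
pyramid_bytes = sum(level_voxels(dims, lvl) for lvl in range(levels)) * bytes_per_voxel
ratio = pyramid_bytes / raw_bytes

print(raw_bytes)          # bytes for level 0 alone
print(round(ratio, 3))    # total size relative to level 0
```

Under that assumption, a truly uncompressed file would be roughly 14% larger than the raw 16-bit voxel data of level 0, not smaller, which is why a 60% result looks surprising.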

Not necessarily. If there is a lot of black, or a block does not use the full 16-bit dynamic range, space is saved automatically. I guess that is the case here ...

AFAIK HDF5 will only save space if

  • a chunk/block is never written to (not likely here)
  • bit depth is converted to a lower bit depth (is this happening here? I did not enable it)
  • compression is enabled (which I did not do)

Beyond these cases, I am not aware of any automatic "magic" to save space (which in any case would be a kind of compression :-) ). Or am I wrong/confused?

Just imagine: it is a 16-bit container, i.e. 0...65535, but if the intensities only range from 0...12000 or so, it will not use 16 bits but fewer. It always does that by default, as you do not lose anything; at least that is what @tpietzsch told me.
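
As a rough illustration of that arithmetic (the value 12000 is just the example from the comment above, not a measured maximum of this dataset):

```python
max_intensity = 12000      # example value from the comment above
container_bits = 16        # uint16 container

# Smallest number of bits that can represent every value in 0..max_intensity.
bits_needed = max_intensity.bit_length()   # 2**13 <= 12000 < 2**14
fraction = bits_needed / container_bits    # share of the 16-bit size that remains

print(bits_needed, fraction)   # 14 0.875
```

By the same arithmetic, a file at 60% of the raw 16-bit size would correspond to an average of roughly 9.6 bits per value, ignoring the extra resolution levels and any metadata.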

Hmmm, this must be a functionality that was unknown to me. So by knowing the max intensity, you set the bit range to an arbitrary number like 15 or 14 bit, yes? I can see how that may work, and AFAIK JPEG2000 lossless can do the same. But, in order to work correctly, this requires implementation effort, and I have not seen this functionality before, so I would be curious to re-confirm this fact.

Hi,

The compression flag when saving to HDF5 switches DEFLATE compression on or off.
Regardless of this flag, this filter is always used: https://www.hdfgroup.org/HDF5/doc/UG/10_Datasets.html#ScaleOffset
So yes, some form of compression is enabled even if the compression flag in the resave dialog is off.

Because we are using the JHDF5 wrapper library (where some low-level details are hidden), I was not aware that this is actually a filter rather than a feature that datasets "just have".

best regards,
Tobias
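
For readers who have not met this kind of filter, here is a minimal sketch of the general idea for integer data, as far as I understand it: per chunk, subtract the minimum value and store the remainders with only as many bits as the largest remainder needs. The functions `pack_chunk` and `unpack_chunk` are hypothetical names for illustration; this is neither the HDF5 or JHDF5 API nor the actual on-disk layout.

```python
def pack_chunk(values):
    """Pack a chunk as (offset, bits per value, value count, packed integer).

    Illustrative only: NOT the byte layout HDF5 writes to disk.
    """
    offset = min(values)
    bits = (max(values) - offset).bit_length()   # 0 if all values are equal
    packed = 0
    for i, v in enumerate(values):
        packed |= (v - offset) << (i * bits)
    return offset, bits, len(values), packed


def unpack_chunk(offset, bits, count, packed):
    """Exact inverse of pack_chunk; the round trip is lossless."""
    mask = (1 << bits) - 1
    return [((packed >> (i * bits)) & mask) + offset for i in range(count)]


chunk = [100, 103, 101, 107, 100, 105, 102, 104]   # made-up intensities
offset, bits, count, packed = pack_chunk(chunk)
restored = unpack_chunk(offset, bits, count, packed)

print(offset, bits)          # 100 3
print(restored == chunk)     # True
```

In this made-up chunk, eight values fit in 3 bits each instead of 16, and a uniformly dark chunk would need 0 bits per value. That is consistent with the earlier remark that images with a lot of black, or with a narrow intensity range per block, shrink noticeably even when DEFLATE is off.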

Dear Tobias

what you write makes perfect sense and is very useful to know!
I did not know this filter, but it clearly is very helpful and
effective for this use case. Thanks for pointing this out!

All the best,

Mario