Blosc/python-blosc2

"RuntimeError: Cannot decompress" for a compressed sequence of more than 7240 zero bytes

rnauber opened this issue · 3 comments

Hi,
If I compress a sequence of zero bytes, decompression fails with a RuntimeError as soon as the sequence is longer than 7240 bytes:

import blosc2

blosc2.print_versions()
data = bytearray(78732)  # crashes
data = bytearray(7241)  # crashes
# data = bytearray(7240)  # works
# data = bytearray(100)  # works
clevel = 9
cdata = blosc2.compress(data, clevel=clevel)
uncomp = blosc2.decompress(cdata)
assert data == uncomp

gives:

/usr/bin/python3.8 /home/olg/scratch/imaging/components/generic/blochs_test.py
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
python-blosc2 version: 0.2.0
Blosc version: 2.0.4 ($Date:: 2021-10-02 #$)
Compressors available: ['blosclz', 'lz4', 'lz4hc', 'zlib', 'zstd']
Compressor library versions:
  blosclz: 2.5.1
  lz4: 1.9.3
  lz4hc: 1.9.3
  zlib: 1.2.11.zlib-ng
  zstd: 1.5.0
Python version: 3.8.10 (default, Nov 26 2021, 20:14:08) 
[GCC 9.3.0]
Traceback (most recent call last):
  File "/home/olg/scratch/imaging/components/generic/blochs_test.py", line 10, in <module>
    uncomp = blosc2.decompress(cdata)
  File "/usr/local/lib/python3.8/dist-packages/blosc2/core.py", line 170, in decompress
    return blosc2_ext.decompress(src, dst, as_bytearray)
  File "blosc2_ext.pyx", line 458, in blosc2.blosc2_ext.decompress
RuntimeError: Cannot decompress
Platform: Linux-5.13.0-25-generic-x86_64 (#26-Ubuntu SMP Fri Jan 7 15:48:31 UTC 2022)
Linux dist: Ubuntu 20.04.3 LTS
Processor: x86_64
Byte-ordering: little
Detected cores: 4
Number of threads to use by default: 4
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

What can I do?

Thank you for your help!
Best,
Richard

v0.3.0 also gives the same error:

/usr/bin/python3.8 /home/olg/scratch/imaging/components/generic/blochs_test.py
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
python-blosc2 version: 0.3.0
Blosc version: 2.2.0 ($Date:: 2022-07-05 #$)
Compressors available: ['blosclz', 'lz4', 'lz4hc', 'zlib', 'zstd']
Compressor library versions:
  blosclz: 2.5.1
  lz4: 1.9.3
  lz4hc: 1.9.3
  zlib: 1.2.11.zlib-ng
  zstd: 1.5.2
Python version: 3.8.10 (default, Jun 22 2022, 20:18:18) 
[GCC 9.4.0]
Platform: Linux-5.15.0-41-generic-x86_64 (#44-Ubuntu SMP Wed Jun 22 14:20:53 UTC 2022)
Linux dist: Ubuntu 20.04.4 LTS
Processor: x86_64
Byte-ordering: little
Detected cores: 4
Number of threads to use by default: 4
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Traceback (most recent call last):
  File "/home/olg/scratch/imaging/components/generic/blochs_test.py", line 10, in <module>
    uncomp = blosc2.decompress(cdata)
  File "/usr/local/lib/python3.8/dist-packages/blosc2/core.py", line 169, in decompress
    return blosc2_ext.decompress(src, dst, as_bytearray)
  File "blosc2_ext.pyx", line 451, in blosc2.blosc2_ext.decompress
RuntimeError: Cannot decompress

Process finished with exit code 1

Yep, I can reproduce this. This has been fixed in bd0a7eb. The problem was that itemsize defaulted to 8, which is not a good value for bytearray objects. The fix sets itemsize to 1 by default, or to src.itemsize when the source object provides one. If you don't want to reinstall the package, a good workaround for your case is to pass typesize explicitly:

cdata = blosc2.compress(data, typesize=1, clevel=clevel)
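
For anyone curious how the default described above might be chosen, here is a minimal sketch. It is not the code from bd0a7eb; the helper name pick_typesize is hypothetical, and the sketch only assumes that objects such as array.array expose an itemsize attribute while bytearray does not.

```python
import array


def pick_typesize(src, typesize=None):
    """Hypothetical helper: choose an item size for `src`.

    An explicit `typesize` wins; otherwise use `src.itemsize` when the
    object has one, and fall back to 1 (plain bytes) when it does not.
    """
    if typesize is not None:
        return typesize
    return getattr(src, "itemsize", 1)


# bytearray has no itemsize attribute, so the fallback of 1 is used.
print(pick_typesize(bytearray(7241)))

# array.array exposes itemsize, so that value is picked up.
print(pick_typesize(array.array("d", [0.0])))

# An explicit value always takes precedence.
print(pick_typesize(bytearray(7241), typesize=4))
```

With a rule like this, a bytearray of zeros would be treated as a stream of single bytes rather than 8-byte items, which matches what passing typesize=1 does explicitly.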

Great, thanks for the fix and the workaround, @FrancescAlted!
Best,
Richard