bug in save for variable length arrays?
JamiePringle opened this issue · 2 comments
This is related to the issue in #691, but with more specifics. Saving a variable-length (ragged) array with zarr.save fails, but creating the file without the convenience function works. I am running Python 3.9.12 and zarr 2.11.3. When I run the following:
import zarr
import numcodecs
import numpy as np
z = zarr.empty(4, dtype=object, object_codec=numcodecs.VLenArray(int))
z[0] = np.array([1, 3, 5])
z[1] = np.array([4])
z[2] = np.array([7, 9, 14])
z[3] = np.array([1, 1])
zarr.save('jnk.zarr', z)
It fails with a traceback that culminates in:
File ~/anaconda3/envs/py3_parcels_mpi_bleedingApr2022/lib/python3.9/site-packages/zarr/storage.py:427, in _init_array_metadata(store, shape, chunks, dtype, compressor, fill_value, order, overwrite, path, chunk_store, filters, object_codec, dimension_separator)
424 if object_codec is None:
425 if not filters:
426 # there are no filters so we can be sure there is no object codec
--> 427 raise ValueError('missing object_codec for object array')
428 else:
429 # one of the filters may be an object codec, issue a warning rather
430 # than raise an error to maintain backwards-compatibility
431 warnings.warn('missing object_codec for object array; this will raise a '
432 'ValueError in version 3.0', FutureWarning)
ValueError: missing object_codec for object array
This makes it rather hard to use ragged arrays... Am I doing something dumb? Or is something broken? What I really need to do is write ragged arrays to zarr data stores. When I type z.filters it returns [VLenArray(dtype='<i8')].
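For anyone following the traceback, the branch at lines 424-432 of storage.py can be paraphrased as the standalone sketch below. The function name check_object_codec and its return strings are hypothetical, chosen only for illustration; this is not zarr's API, and it covers only the branch quoted above. It suggests that the array reaching _init_array_metadata through zarr.save had neither an object_codec nor any filters, even though z.filters on the in-memory array lists VLenArray. That reading is an inference from the traceback, not something confirmed in this thread.

```python
import warnings


def check_object_codec(dtype_is_object, object_codec=None, filters=None):
    """Simplified stand-in for the branch quoted in the traceback.

    Hypothetical helper for illustration only; not part of zarr.
    """
    if not dtype_is_object or object_codec is not None:
        # Nothing to complain about: either the dtype is not object,
        # or an object codec was supplied explicitly.
        return "ok"
    if not filters:
        # No filters at all, so no object codec can be hiding among them.
        raise ValueError('missing object_codec for object array')
    # One of the filters may be an object codec, so only warn.
    warnings.warn('missing object_codec for object array; this will raise a '
                  'ValueError in version 3.0', FutureWarning)
    return "warned"
```

Calling check_object_codec(True) raises the same ValueError message as in the traceback, while check_object_codec(True, filters=['some filter']) only emits the FutureWarning.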
However, if I create the data store manually, it works fine. The following code runs without error:
import zarr
import numcodecs
import numpy as np
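# Create the on-disk store and its root group explicitly
store = zarr.DirectoryStore('jnkStore.zarr')
root = zarr.group(store=store)
# Create the object array inside the group, passing the object codec directly
z = root.empty(shape=(4,), name='z', dtype=object, object_codec=numcodecs.VLenArray(int))
z[0] = np.array([1, 3, 5])
z[1] = np.array([4])
z[2] = np.array([7, 9, 14])
z[3] = np.array([1, 1])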
Can you please also include conda list or pip list as appropriate?
Attached is the output of conda list. Cheers, Jamie