Fill values are not preserved in rechunking
flamingbear opened this issue · 2 comments
This is a follow-on from the previous report, #131.
While diffing source.zarr and target.zarr after running rechunker, I also noticed that the fill_values are not preserved. Below is basically the same script as #131, with an additional call to consolidate_metadata.
If you run this script, you will see that the fill value in "foo/bar/.zarray" changes from "fill_value": 1.0 in the source zarr store to "fill_value": null in the target zarr store.
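The comparison described above can also be done without opening the stores through zarr. The following is a minimal sketch, assuming Zarr v2 directory stores in which each array keeps its metadata in a JSON file named `.zarray`; the helper names `read_fill_value` and `compare_fill_values` are hypothetical and not part of zarr or rechunker.

```python
import json
from pathlib import Path


def read_fill_value(store_dir, array_path):
    """Return the fill_value recorded in a Zarr v2 array's .zarray file.

    Assumes an on-disk directory store; `array_path` is the array's path
    inside the store, e.g. "foo/bar".
    """
    zarray = Path(store_dir) / array_path / ".zarray"
    with zarray.open() as f:
        return json.load(f)["fill_value"]


def compare_fill_values(source_dir, target_dir, array_path):
    """Return (source_fill, target_fill, match) for one array."""
    src = read_fill_value(source_dir, array_path)
    dst = read_fill_value(target_dir, array_path)
    return src, dst, src == dst
```

After running the script in this report, calling `compare_fill_values('testoutput/source.zarr', 'testoutput/target.zarr', 'foo/bar')` should show whether the two `.zarray` files agree. JSON `null` is read back as Python `None`.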
Thanks,
Matt
import zarr
from rechunker import rechunk
import shutil


def run_create_input_store():
    shutil.rmtree('testoutput/', ignore_errors=True)
    store = zarr.DirectoryStore('testoutput/source.zarr')
    root = zarr.group(store=store, overwrite=True)
    foo = root.create_group('foo')
    root.attrs['description'] = 'root description'
    foo.attrs['description'] = 'foo description'
    bar = foo.ones('bar', shape=(10, 10))
    bar[5, 5] = 3
    bar.attrs['description'] = 'foo description'
    zarr.consolidate_metadata(store)


def rechunkit():
    openstore = zarr.open_consolidated('testoutput/source.zarr')
    array_plan = rechunk(openstore, {'foo/bar': (5, 5)},
                         '1MB',
                         'testoutput/target.zarr',
                         temp_store='testoutput/temp.zarr')
    array_plan.execute()
    zarr.consolidate_metadata('testoutput/target.zarr')


if __name__ == '__main__':
    run_create_input_store()
    rechunkit()
    print('Compare the .zmetadata files in both your source.zarr and target.zarr directories')
    print('You will see that the "fill_value" in the source is 1.0 and it is null in the target.')
    source = zarr.open('testoutput/source.zarr')
    target = zarr.open('testoutput/target.zarr')
    print(source['foo']['bar'].fill_value)
    print(target['foo']['bar'].fill_value)
Maybe the fill value is put into the data? The grids themselves look the same. I will check my other "real" output.
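One rough way to probe that question: in Zarr v2 the fill value is only substituted for chunks that are absent from the store, so if every chunk was written to disk, the data would read back the same regardless of the `fill_value` in the metadata. The sketch below counts stored chunk files against the number the chunk grid implies. It assumes a Zarr v2 directory store with the default flat chunk keys such as `0.0` and `1.0`, and `count_chunks` is a hypothetical helper, not a zarr or rechunker function.

```python
import json
import math
from pathlib import Path


def count_chunks(store_dir, array_path):
    """Return (stored, expected) chunk counts for a Zarr v2 array on disk."""
    array_dir = Path(store_dir) / array_path
    meta = json.loads((array_dir / ".zarray").read_text())

    # Number of chunks implied by the array shape and chunk shape.
    expected = math.prod(
        math.ceil(s / c) for s, c in zip(meta["shape"], meta["chunks"])
    )

    # Assumption: chunk files sit directly in the array directory and are
    # named by grid index ("0.0", "1.0", ...). Metadata files such as
    # .zarray and .zattrs start with "." and are skipped.
    stored = sum(
        1 for p in array_dir.iterdir()
        if p.is_file() and not p.name.startswith(".")
    )
    return stored, expected
```

If `stored` equals `expected` for both `testoutput/source.zarr` and `testoutput/target.zarr`, identical grids would be consistent with a changed fill value, since no chunk would need to be filled in on read.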
I'm closing this and will open a new issue about the same problem that demonstrates it better.