Require using make_template() if providing a template to ChunksToZarr?
shoyer opened this issue · 2 comments
Currently we support passing an xarray.Dataset full of chunked dask.array objects as template into ChunksToZarr.
This is convenient in simple cases, but it makes it easy to write pipelines that are very slow to set up if you pass in a chunked Dataset with many small chunks (e.g., the default output of xarray.open_zarr()).
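To give a rough sense of scale, here is a small sketch that counts how many chunks a single variable is split into. The shape and chunk sizes are made up for illustration and the helper `count_chunks` is hypothetical (not part of xarray-beam); the only assumption is that setup work grows with the number of chunks in the template.

```python
import math


def count_chunks(shape, chunk_shape):
    """Number of chunks for an array of `shape` split into `chunk_shape` blocks."""
    return math.prod(math.ceil(n / c) for n, c in zip(shape, chunk_shape))


# Hypothetical variable: one year of hourly data on a 721 x 1440 grid.
shape = (8760, 721, 1440)

# One time step per chunk, the kind of layout that yields many small chunks.
many_small = count_chunks(shape, (1, 721, 1440))

# A single chunk spanning the whole array.
single = count_chunks(shape, shape)

print(many_small, single)
```

With thousands of variables or finer chunking, the first number grows quickly while the second stays at one per variable.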
The breaking change here would be to require that the template argument be created via make_template(), by checking that each dask.array in the supplied Dataset consists of only a single chunk. We would also make zarr_chunks required when supplying a template, because it makes no sense to copy chunks from a template created with make_template().
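A minimal sketch of what the single-chunk check could look like. It is not the actual xarray-beam implementation: the function names are hypothetical, and it operates on plain tuples in the format of dask.array's `.chunks` attribute (one tuple of block sizes per dimension) so that it runs without any dependencies.

```python
def is_single_chunk(chunks):
    """True if `chunks` (dask-style, e.g. ((10,), (20,))) has one block per dimension."""
    return all(len(dim_chunks) == 1 for dim_chunks in chunks)


def check_template_chunks(chunks_by_name):
    """Raise ValueError if any variable is split into more than one chunk.

    `chunks_by_name` maps variable names to dask-style chunks tuples.
    """
    bad = sorted(
        name for name, chunks in chunks_by_name.items()
        if not is_single_chunk(chunks)
    )
    if bad:
        raise ValueError(
            f"template variables with more than one chunk: {bad}; "
            "create the template with make_template() instead"
        )


# Passes: every dimension of every variable is a single block.
check_template_chunks({"temperature": ((8760,), (721,), (1440,))})
```

For a real Dataset, the mapping could presumably be built from each dask-backed variable's `.data.chunks`, skipping variables that are not backed by dask arrays.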
As an alternative, we could instead call make_template() internally inside ChunksToZarr.