pangeo-data/rechunker

Rechunk to an existing store

jbeezley opened this issue · 0 comments

I have a data pipeline where data arrives incrementally. Whenever new data lands in the source store, the pipeline performs a naive rechunking into a zarr store. Rechunker has a much better algorithm that I would like to use, but it cannot target an existing store.

This problem seems related to #8. However, for my use case a simpler implementation would be to optionally skip the call at https://github.com/pangeo-data/rechunker/blob/master/rechunker/api.py#L599 and open the dataset instead.

I would be willing to implement this via an optional kwarg, but I wanted to check whether such a change would be accepted and whether there are any issues with it that I'm not considering. Clearly, there could be problems if the dimensions or variables of the destination are not compatible with the source. I could check for that after opening the store, or just let the exceptions from zarr pass through. Thoughts?
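
To make the check-after-opening option concrete, here is a minimal sketch of the kind of validation I have in mind. It is not rechunker code: `ArrayMeta` and `find_incompatibilities` are hypothetical names, and it assumes that "compatible" means every source variable exists in the target with the same shape and dtype. The real rule may need to be looser, for example if the target is resized as new data arrives.

```python
from dataclasses import dataclass
from typing import List, Mapping, Tuple


@dataclass
class ArrayMeta:
    """Hypothetical stand-in for an opened array; only shape and dtype are used."""
    shape: Tuple[int, ...]
    dtype: str


def find_incompatibilities(source: Mapping[str, ArrayMeta],
                           target: Mapping[str, ArrayMeta]) -> List[str]:
    """Return human-readable problems; an empty list means the target looks usable.

    Assumption: source and target map variable names to objects exposing
    `.shape` and `.dtype`, and both must match exactly.
    """
    problems = []
    # Variables present in the source but absent from the target.
    for name in sorted(set(source) - set(target)):
        problems.append(f"{name}: missing from target")
    # Variables present in both: compare shape and dtype.
    for name in sorted(set(source) & set(target)):
        src, dst = source[name], target[name]
        if tuple(src.shape) != tuple(dst.shape):
            problems.append(
                f"{name}: shape {tuple(src.shape)} != {tuple(dst.shape)}"
            )
        if str(src.dtype) != str(dst.dtype):
            problems.append(f"{name}: dtype {src.dtype} != {dst.dtype}")
    return problems


source = {"temperature": ArrayMeta((10, 4), "float32")}
target = {"temperature": ArrayMeta((10, 5), "float32")}
for problem in find_incompatibilities(source, target):
    print(problem)
```

Collecting all problems before raising would give a clearer error than whichever exception zarr happens to hit first, at the cost of duplicating some checks that zarr already performs.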