
ZeroDivisionError in L70 of

norlandrhagen opened this issue · 5 comments

Hi there,

I'm having an issue when trying to rechunk a 4D Zarr store. The traceback points at line 70 (headroom = max_mem // chunk_mem), where I get ZeroDivisionError: integer division or modulo by zero.

The snippet below should be an MRE.

import xarray as xr
import fsspec
import rechunker
import zarr

store_url = "gs://cmip6/CMIP6/CMIP/CCCma/CanESM5/historical/r1i1p1f1/Omon/thetao/gn/v20190429"
ds = xr.open_dataset(store_url, engine='zarr', chunks={}, decode_cf=False)
group = zarr.open_consolidated(store_url)

chunks_dict = {
    'i': {'i': 128},
    'j': {'j': 128},
    'latitude': {'i': 128, 'j': 128},
    'lev': {'lev': -1},
    'lev_bnds': {'lev': -1, 'bnds': -1},
    'longitude': {'i': 128, 'j': 128},
    'thetao': {'i': 128, 'j': 128, 'time': 2, 'lev': -1},
    'time': {'time': 2},
    'time_bnds': {'time': 2, 'bnds': -1},
    'vertices_latitude': {'i': 128, 'j': 128, 'vertices': -1},
    'vertices_longitude': {'i': 128, 'j': 128, 'vertices': -1},
}

tmp_mapper = fsspec.get_mapper('temp_store')
tgt_mapper = fsspec.get_mapper('staging_store')

array_plan = rechunker.rechunk(group, chunks_dict, "1000MB", tgt_mapper, temp_store=tmp_mapper)

A few notes:

  • The ZeroDivisionError seems to happen on the thetao variable.
  • At line 70, in the headroom = max_mem // chunk_mem calculation, chunk_mem somehow ends up as 0 inside the for loop:
    chunk_mem = itemsize * prod(chunks)
    if chunk_mem > max_mem:
        raise ValueError(f"chunk_mem {chunk_mem} > max_mem {max_mem}")
    headroom = max_mem // chunk_mem

    new_chunks = list(chunks)
    # only consolidate over these axes
    axes = sorted(chunk_limit_per_axis.keys())[::-1]
    for n_axis in axes:
        c_new = min(
            chunks[n_axis] * headroom, shape[n_axis], chunk_limit_per_axis[n_axis]
        )
        # print(f'  axis {n_axis}, {chunks[n_axis]} -> {c_new}')
        new_chunks[n_axis] = c_new
        chunk_mem = itemsize * prod(new_chunks)
        headroom = max_mem // chunk_mem
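
To see how chunk_mem can reach 0, here is a standalone sketch of the loop quoted above, run outside rechunker. The function name consolidate, the shape, the itemsize of 4, the per-axis limits, and the integer max_mem are all assumptions made for illustration; only the chunk tuple with a -1 in it mirrors the thetao entry in chunks_dict. With a -1 in the chunks, the product is negative, so the chunk_mem > max_mem check passes, headroom goes negative, and a later iteration multiplies a chunk by a headroom of 0.

```python
from math import prod


def consolidate(shape, chunks, itemsize, max_mem, chunk_limit_per_axis):
    """Standalone copy of the quoted loop, for illustration only."""
    chunk_mem = itemsize * prod(chunks)
    if chunk_mem > max_mem:
        raise ValueError(f"chunk_mem {chunk_mem} > max_mem {max_mem}")
    headroom = max_mem // chunk_mem

    new_chunks = list(chunks)
    # only consolidate over these axes, last axis first
    axes = sorted(chunk_limit_per_axis.keys())[::-1]
    for n_axis in axes:
        c_new = min(
            chunks[n_axis] * headroom, shape[n_axis], chunk_limit_per_axis[n_axis]
        )
        new_chunks[n_axis] = c_new
        chunk_mem = itemsize * prod(new_chunks)
        print(f"axis {n_axis}: new_chunks={new_chunks}, chunk_mem={chunk_mem}")
        headroom = max_mem // chunk_mem  # fails once chunk_mem is 0
        if headroom == 1:
            break
    return tuple(new_chunks)


# Placeholder values (assumptions, not read from the actual store).
shape = (1980, 45, 291, 360)
limits = dict(enumerate(shape))  # allow each axis to grow to its full size
max_mem = 1_000_000_000
itemsize = 4

try:
    consolidate(shape, (2, -1, 128, 128), itemsize, max_mem, limits)
except ZeroDivisionError as err:
    print("failed with ZeroDivisionError:", err)
```

The printed lines show the chunk sizes turning negative and then zero before the division fails. Whether rechunker reaches this loop with exactly these values is not confirmed here; the sketch only shows that a negative chunk size is enough to trigger the error.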


ZeroDivisionError                         Traceback (most recent call last)
Cell In[11], line 2
      1 # %pdb on 
----> 2 array_plan = rechunker.rechunk(group, chunks_dict, "1000MB", tgt_mapper, temp_store=tmp_mapper)

File ~/opt/anaconda3/envs/install/envs/ncviewjs/lib/python3.10/site-packages/rechunker/, in rechunk(source, target_chunks, max_mem, target_store, target_options, temp_store, temp_options, executor)
    299 if isinstance(executor, str):
    300     executor = _get_executor(executor)
--> 302 copy_spec, intermediate, target = _setup_rechunk(
    303     source=source,
    304     target_chunks=target_chunks,
    305     max_mem=max_mem,
    306     target_store=target_store,
    307     target_options=target_options,
    308     temp_store=temp_store,
    309     temp_options=temp_options,
    310 )
    311 plan = executor.prepare_plan(copy_spec)
    312 return Rechunked(executor, plan, source, intermediate, target)

File ~/opt/anaconda3/envs/install/envs/ncviewjs/lib/python3.10/site-packages/rechunker/, in _setup_rechunk(source, target_chunks, max_mem, target_store, target_options, temp_store, temp_options)
    454 copy_specs = []
    456 for array_name, array_target_chunks in target_chunks.items():
--> 457     copy_spec = _setup_array_rechunk(
---> 70 headroom = max_mem // chunk_mem
     72 if headroom == 1:
     73     break

ZeroDivisionError: integer division or modulo by zero


Wondering if anyone has any thoughts on why this might be happening.


Thanks for sharing. Can reproduce it even more simply with

target_chunks = (2, -1, 128, 128)
rechunker.rechunk(group['thetao'], target_chunks, "1000MB", tgt_mapper, temp_store=tmp_mapper)

I'll investigate.

So it looks like we can fix it with

target_chunks = (2, 45, 128, 128)

My impression is that we don't support the syntax -1 to mean "the full size of the dimension." This could easily be fixed. Or we could raise an error if you pass a chunk shape as a negative number.
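
The second option (raising an error) could look roughly like the sketch below. check_chunks is a hypothetical helper name, not an existing rechunker function, and the error message wording is made up for illustration.

```python
def check_chunks(chunks):
    """Hypothetical validation: reject chunk sizes that are not positive."""
    bad = [c for c in chunks if c < 1]
    if bad:
        raise ValueError(
            f"chunk sizes must be positive integers, got {tuple(chunks)}"
        )
    return tuple(chunks)


print(check_chunks((2, 45, 128, 128)))  # passes through unchanged

try:
    check_chunks((2, -1, 128, 128))
except ValueError as err:
    print("rejected:", err)
```

Failing early like this would turn the ZeroDivisionError into a message that names the offending chunk tuple.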

Thanks for the wildly quick response @rabernat!

I didn't realize that using -1 to mean the full size of the dimension wasn't supported. Either of those options sounds like an improvement to me!

PR welcome! 😉

The check would go here:

new_chunks = list(chunks)

Just replace the special value of -1 with the corresponding shape (i.e. swap -1 for 45 like I did above), something like

new_chunks = [s if c == -1 else c for c, s in zip(chunks, shape)]
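
As a self-contained sketch of that substitution: normalize_chunks and normalize_chunks_dict below are hypothetical helper names, and the shape is a placeholder apart from the lev size of 45 mentioned above. The dict variant covers the {'dim': size} form used in the original report and assumes a mapping from dimension name to dimension length is available.

```python
def normalize_chunks(chunks, shape):
    """Replace -1 with the full length of the corresponding axis."""
    if len(chunks) != len(shape):
        raise ValueError("chunks and shape must have the same length")
    return tuple(s if c == -1 else c for c, s in zip(chunks, shape))


def normalize_chunks_dict(chunks, sizes):
    """Same idea for {'dim': chunk} input; sizes maps dim name to length."""
    return {dim: (sizes[dim] if c == -1 else c) for dim, c in chunks.items()}


# Placeholder shape (time, lev, j, i); only lev=45 comes from this thread.
shape = (1980, 45, 291, 360)
print(normalize_chunks((2, -1, 128, 128), shape))
# -> (2, 45, 128, 128)

sizes = {"time": 1980, "lev": 45, "j": 291, "i": 360}
print(normalize_chunks_dict({"i": 128, "j": 128, "time": 2, "lev": -1}, sizes))
# -> {'i': 128, 'j': 128, 'time': 2, 'lev': 45}
```

Doing the substitution before any memory arithmetic keeps every chunk size positive, so the headroom calculation never sees a negative or zero product.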

Sounds great! I'll open up a PR.