cubed-xarray

Interface for using cubed with xarray

Primary language: Python. License: Apache License 2.0 (Apache-2.0).

Note: this is a proof-of-concept, and many things are incomplete, untested, or don't work.

Requirements

  • Cubed version >=0.6.3
  • Xarray version >=2023.05.0

Installation

Install via pip

pip install cubed-xarray

or conda

conda install -c conda-forge cubed-xarray

Importing

You don't need to import this package in user code. Once it is properly installed, xarray should automatically become aware of it via the magic of entrypoints.
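To confirm that the entrypoint was picked up, one option is to list what is registered. The sketch below uses only the standard library and assumes that xarray discovers chunked-array backends through an entry point group named `xarray.chunkmanagers`; that group name and the helper `registered_chunkmanagers` are assumptions for illustration, not part of this package's documented API.

```python
from importlib.metadata import entry_points


def registered_chunkmanagers(group="xarray.chunkmanagers"):
    """Return the sorted names of entry points registered under `group`.

    The default group name is an assumption about how xarray discovers
    chunked-array backends; adjust it if your xarray version differs.
    """
    try:
        eps = entry_points(group=group)  # selection API, Python 3.10+
    except TypeError:
        # Older Pythons return a dict-like mapping of group -> entry points.
        eps = entry_points().get(group, [])
    return sorted(ep.name for ep in eps)


# With cubed-xarray installed, "cubed" would be expected among the names.
print(registered_chunkmanagers())
```

An empty list, or a list without `cubed`, would suggest that the package is not installed in the active environment.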

Usage

Xarray objects backed by cubed arrays can be created in any of three ways:

  1. Passing existing cubed.Array objects to the data argument of xarray constructors,
  2. Calling .chunk on xarray objects,
  3. Passing a chunks argument to xarray.open_dataset.

In (2) and (3) the choice to use cubed.Array instead of dask.array.Array is made by passing the keyword argument chunked_array_type='cubed'. Arguments intended for the constructor of cubed.Array go in the from_array_kwargs dictionary, e.g. from_array_kwargs={'spec': cubed.Spec(allowed_mem='2GB')}.

If cubed and cubed-xarray are installed but dask is not, then specifying chunked_array_type is not necessary, as the entrypoints system will then default to the only chunked parallel backend available (i.e. cubed).

Sharp Edges 🔪

Some things almost certainly won't work yet:

and some other things might work but have not yet been tried:

  • Saving to formats other than zarr

In general a bug could take the form of an error, or of a silent attempt to coerce the array type to numpy by immediately computing the underlying array.
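Because a silent coercion raises no error, it can help to check the backing array type after each step of a workflow. The helper below is a hypothetical sketch, not part of cubed-xarray: it inspects the module of the wrapped array's type, on the assumption that cubed array classes live under the `cubed` package, and it does not trigger any computation.

```python
import numpy as np
import xarray as xr


def is_cubed_backed(obj: xr.DataArray) -> bool:
    """Return True if the array wrapped by `obj` comes from the cubed package.

    Hypothetical helper: it only looks at the type's module name, so it
    reads `.data` without computing anything.
    """
    module = type(obj.data).__module__
    return module == "cubed" or module.startswith("cubed.")


# A plain numpy-backed DataArray is reported as not cubed-backed.
eager = xr.DataArray(np.arange(4), dims="x")
print(is_cubed_backed(eager))
```

Calling this before and after an operation on a cubed-backed object gives a quick signal: a change from True to False would indicate that the operation computed the array and fell back to numpy.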

Tests

Integration tests for wrapping cubed with xarray also live in this repository.