Cloud-Drift/clouddrift

🐛 binder notebook gdp-6hourly is broken

Closed this issue · 2 comments

Hi there,

I just tried the first binder notebook (gdp-6hourly.ipynb). Cell number 11 fails with this error:

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
File /srv/conda/envs/notebook/lib/python3.10/site-packages/awkward/_dispatch.py:60, in named_high_level_function.<locals>.dispatch(*args, **kwargs)
     59 try:
---> 60     next(gen_or_result)
     61 except StopIteration as err:

File /srv/conda/envs/notebook/lib/python3.10/site-packages/awkward/operations/ak_to_numpy.py:44, in to_numpy(array, allow_missing)
     43 # Implementation
---> 44 return _impl(array, allow_missing)

File /srv/conda/envs/notebook/lib/python3.10/site-packages/awkward/operations/ak_to_numpy.py:56, in _impl(array, allow_missing)
     54 numpy_layout = layout.to_backend(backend)
---> 56 return numpy_layout.to_backend_array(allow_missing=allow_missing)

File /srv/conda/envs/notebook/lib/python3.10/site-packages/awkward/contents/content.py:1077, in Content.to_backend_array(self, allow_missing, backend)
   1076     backend = regularize_backend(backend)
-> 1077 return self._to_backend_array(allow_missing, backend)

File /srv/conda/envs/notebook/lib/python3.10/site-packages/awkward/contents/listoffsetarray.py:2042, in ListOffsetArray._to_backend_array(self, allow_missing, backend)
   2040 buffer = backend.nplike.empty(max_count * self.length, dtype=np.uint8)
-> 2042 self.backend[
   2043     "awkward_NumpyArray_pad_zero_to_length",
   2044     self._content.dtype.type,
   2045     self._offsets.dtype.type,
   2046     buffer.dtype.type,
   2047 ](
   2048     self._content.data,
   2049     self._offsets.data,
   2050     self._offsets.length,
   2051     max_count,
   2052     buffer,
   2053 )
   2054 return buffer.view(np.dtype(("S", max_count)))

File /srv/conda/envs/notebook/lib/python3.10/site-packages/awkward/_backends/numpy.py:35, in NumpyBackend.__getitem__(self, index)
     34 def __getitem__(self, index: KernelKeyType) -> NumpyKernel:
---> 35     return NumpyKernel(awkward_cpp.cpu_kernels.kernel[index], index)

KeyError: ('awkward_NumpyArray_pad_zero_to_length', <class 'numpy.uint8'>, <class 'numpy.int32'>, <class 'numpy.uint8'>)

The above exception was the direct cause of the following exception:

KeyError                                  Traceback (most recent call last)
Cell In[11], line 1
----> 1 ra = RaggedArray.from_parquet("../data/process/gdp_6h.parquet")

File /srv/conda/envs/notebook/lib/python3.10/site-packages/clouddrift/raggedarray.py:166, in RaggedArray.from_parquet(cls, filename, name_coords)
    148 @classmethod
    149 def from_parquet(
    150     cls, filename: str, name_coords: Optional[list] = ["time", "lon", "lat", "ids"]
    151 ):
    152     """Read a ragged array from a parquet file.
    153 
    154     Parameters
   (...)
    164         A ragged array instance
    165     """
--> 166     return cls.from_awkward(ak.from_parquet(filename), name_coords)

File /srv/conda/envs/notebook/lib/python3.10/site-packages/clouddrift/raggedarray.py:63, in RaggedArray.from_awkward(cls, array, name_coords)
     60     attrs_variables[var] = array.obs[var].layout.parameters["attrs"]
     62 for var in [v for v in array.fields if v != "obs"]:
---> 63     metadata[var] = array[var].to_numpy()
     64     attrs_variables[var] = array[var].layout.parameters["attrs"]
     66 for var in [v for v in array.obs.fields if v not in name_coords]:

File /srv/conda/envs/notebook/lib/python3.10/site-packages/awkward/highlevel.py:459, in Array.to_numpy(self, allow_missing)
    455 def to_numpy(self, allow_missing=True):
    456     """
    457     Converts this Array into a NumPy array, if possible; same as #ak.to_numpy.
    458     """
--> 459     return ak.operations.to_numpy(self, allow_missing=allow_missing)

File /srv/conda/envs/notebook/lib/python3.10/site-packages/awkward/_dispatch.py:36, in named_high_level_function.<locals>.dispatch(*args, **kwargs)
     33 @wraps(func)
     34 def dispatch(*args, **kwargs):
     35     # NOTE: this decorator assumes that the operation is exposed under `ak.`
---> 36     with OperationErrorContext(name, args, kwargs):
     37         gen_or_result = func(*args, **kwargs)
     38         if isgenerator(gen_or_result):

File /srv/conda/envs/notebook/lib/python3.10/site-packages/awkward/_errors.py:67, in ErrorContext.__exit__(self, exception_type, exception_value, traceback)
     60 try:
     61     # Handle caught exception
     62     if (
     63         exception_type is not None
     64         and issubclass(exception_type, Exception)
     65         and self.primary() is self
     66     ):
---> 67         self.handle_exception(exception_type, exception_value)
     68 finally:
     69     # `_kwargs` may hold cyclic references, that we really want to avoid
     70     # as this can lead to large buffers remaining in memory for longer than absolutely necessary
     71     # Let's just clear this, now.
     72     self._kwargs.clear()

File /srv/conda/envs/notebook/lib/python3.10/site-packages/awkward/_errors.py:82, in ErrorContext.handle_exception(self, cls, exception)
     80     self.decorate_exception(cls, exception)
     81 else:
---> 82     raise self.decorate_exception(cls, exception)

KeyError: ('awkward_NumpyArray_pad_zero_to_length', <class 'numpy.uint8'>, <class 'numpy.int32'>, <class 'numpy.uint8'>)

This error occurred while calling

    ak.to_numpy(
        <Array [b'SVPB   ', b'SVP    ', ..., b'SVPB   '] type='100 * bytes'>
        allow_missing = True
    )
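For anyone reading the traceback: the last awkward frame indexes a kernel table with a tuple made of the kernel name and three dtype types, so the KeyError appears to mean that no kernel is registered for that exact combination. The toy sketch below only illustrates that failure mode. The `KERNELS` table, its single entry, and the `lookup` helper are made up for illustration and are not awkward's actual registry or API.

```python
# Toy stand-in for a kernel table keyed by (kernel name, *dtype names).
# The single registered entry is hypothetical and for illustration only.
KERNELS = {
    ("pad_zero_to_length", "uint8", "int64", "uint8"): lambda: "ok",
}


def lookup(name, *dtypes):
    """Return the kernel registered for this exact name/dtype combination."""
    return KERNELS[(name, *dtypes)]


try:
    # Same shape of request as in the traceback: int32 offsets are not in the table.
    lookup("pad_zero_to_length", "uint8", "int32", "uint8")
except KeyError as err:
    missing_key = err.args[0]
    print("no kernel registered for", missing_key)
```

A lookup like this fails on the key alone, before any data is touched, which would be consistent with a version mismatch between installed packages rather than a problem in the parquet file itself.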

I haven't tried running the notebook locally, but I noticed that an old version of clouddrift (0.21.2) is installed on Binder. This might be a similar issue to #449, where the usage code is outdated. If that's the case, I think you need to automate the way you handle and publish the demos.
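One way to confirm which versions an environment actually has is to query the installed package metadata from a notebook cell. This is a minimal sketch using only the standard library. The helper name `installed_version` is hypothetical, and the package names are the ones that appear in the traceback above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Report the versions of the packages involved in the failing cell.
for name in ("clouddrift", "awkward"):
    print(name, installed_version(name))
```

Comparing this output between Binder and a local install should show whether the environments differ.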

Let me know when I can try again.

JOSS: openjournals/joss-reviews#6742

Once we release a new version, it should be easier to fix with #454.

We have re-evaluated the usefulness of this notebook and decided that it was not the best way to showcase the library. We have therefore made this repo private for now, with the intent of revisiting these notebooks later. The goal of these notebooks was to explain how to build data adapters for Lagrangian datasets, but these methods are not intended for new users of the library. In the README of the library we now emphasize and link to three other notebooks that are better suited to first-time users.