NVlabs/earth2grid

GPU Tensor -> Numpy Bug

dallasfoster opened this issue · 1 comment

There are two places where torch.Tensor.numpy() is called on tensors that can end up on the GPU. I do not believe these tensors are ever meant to live on the GPU, but if torch.set_default_device() is called before they are created, they are placed there automatically and the call to .numpy() raises an error.

ipix = torch.from_numpy(self._nest_ipix())

ipix = torch.from_numpy(self._nest_ipix())

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/code/earth2studio/models/px/dlwp_healpix.py", line 212, in load_model
    rect_hpx_grid = earth2grid.get_regridder(rect, hpx).to(torch.float32)
  File "/usr/local/lib/python3.10/dist-packages/earth2grid/_regrid.py", line 63, in get_regridder
    return src.get_bilinear_regridder_to(dest.lat, dest.lon)
  File "/usr/local/lib/python3.10/dist-packages/earth2grid/healpix.py", line 214, in lat
    ipix = torch.from_numpy(self._nest_ipix())
  File "/usr/local/lib/python3.10/dist-packages/earth2grid/healpix.py", line 198, in _nest_ipix
    return i.numpy()
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/_device.py", line 76, in __torch_function__
    return func(*args, **kwargs)
TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.
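
For anyone trying to reproduce this outside the library, here is a minimal sketch of the mechanism. The helper `numpy_under_default_device` and the `torch.arange(12)` stand-in are my own assumptions, not earth2grid code, and the sketch uses `torch.device(...)` as a context manager, which changes the default device for factory calls inside the block in a similar way to `torch.set_default_device()`.

```python
import torch


def numpy_under_default_device(device: str):
    """Return the error message raised by .numpy(), or None if it succeeds.

    Hypothetical helper for illustration only; not part of earth2grid.
    """
    # Inside this block, factory calls without an explicit device
    # create their tensors on `device`.
    with torch.device(device):
        i = torch.arange(12)  # no device argument: follows the default
    try:
        i.numpy()  # only works for tensors in host memory
    except TypeError as exc:
        return str(exc)
    return None


# With a CPU default the conversion succeeds.
print(numpy_under_default_device("cpu"))

# With a CUDA default the same code fails, as in the traceback above.
if torch.cuda.is_available():
    print(numpy_under_default_device("cuda"))
```

The CUDA branch is guarded so the snippet still runs on a machine without a GPU.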

Thanks for the report. Indeed, these need to happen on the CPU. Moreover, self._nest_ipix could probably return a tensor now, since we no longer use healpy.
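
A rough sketch of what the two options might look like, assuming the index tensor comes from a factory call such as `torch.arange`. The function names and the `npix` argument below are hypothetical placeholders, not the actual earth2grid implementation. The idea is to pin the tensor to the CPU explicitly, since an explicit `device=` argument takes precedence over the default device.

```python
import numpy as np
import torch


def nest_ipix_numpy(npix: int) -> np.ndarray:
    """Option 1 (hypothetical): keep returning a NumPy array."""
    # An explicit device="cpu" is not overridden by the default device.
    i = torch.arange(npix, device="cpu")
    return i.numpy()


def nest_ipix_tensor(npix: int) -> torch.Tensor:
    """Option 2 (hypothetical): return a CPU tensor and skip NumPy."""
    return torch.arange(npix, device="cpu")


ipix = torch.from_numpy(nest_ipix_numpy(12))
print(ipix.device, nest_ipix_tensor(12).device)
```

If pinning at creation time is not practical, calling `.cpu()` right before `.numpy()` should have the same effect, at the cost of a device-to-host copy when the tensor was created on the GPU.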