CompRhys/aviary

Roost Colab default Cuda version issue

Closed this issue · 4 comments

I tried running the Roost example Colab and got an error that is probably related to Colab now using CUDA 11.2.

OSError: libcudart.so.10.2: cannot open shared object file: No such file or directory
stack trace
OSError                                   Traceback (most recent call last)
<ipython-input-10-fd45f7ae93a3> in <module>()
      1 from aviary.roost.data import CompositionData, collate_batch as roost_cb
----> 2 from aviary.roost.model import Roost
      3 
      4 torch.manual_seed(0)  # ensure reproducible results
      5 

4 frames
/usr/local/lib/python3.7/dist-packages/aviary/roost/model.py in <module>()
      4 
      5 from aviary.core import BaseModelClass
----> 6 from aviary.segments import (
      7     MessageLayer,
      8     ResidualNetwork,

/usr/local/lib/python3.7/dist-packages/aviary/segments.py in <module>()
      1 import torch
      2 import torch.nn as nn
----> 3 from torch_scatter import scatter_add, scatter_max, scatter_mean
      4 
      5 

/usr/local/lib/python3.7/dist-packages/torch_scatter/__init__.py in <module>()
     14     spec = cuda_spec or cpu_spec
     15     if spec is not None:
---> 16         torch.ops.load_library(spec.origin)
     17     elif os.getenv('BUILD_DOCS', '0') != '1':  # pragma: no cover
     18         raise ImportError(f"Could not find module '{library}_cpu' in "

/usr/local/lib/python3.7/dist-packages/torch/_ops.py in load_library(self, path)
    108             # static (global) initialization code in order to register custom
    109             # operators with the JIT.
--> 110             ctypes.CDLL(path)
    111         self.loaded_libraries.add(path)
    112 

/usr/lib/python3.7/ctypes/__init__.py in __init__(self, name, mode, handle, use_errno, use_last_error)
    362 
    363         if handle is None:
--> 364             self._handle = _dlopen(self._name, mode)
    365         else:
    366             self._handle = handle

OSError: libcudart.so.10.2: cannot open shared object file: No such file or directory
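
The loader error names the CUDA runtime that the installed torch_scatter binary was built against. As a rough diagnostic sketch (the helper name `required_cudart_version` is hypothetical and not part of aviary, torch, or torch_scatter), the version can be pulled out of the message like this:

```python
import re


def required_cudart_version(error_message):
    """Return the CUDA runtime version named in a 'libcudart.so.X.Y' loader error.

    Hypothetical helper for illustration only; returns None if no version is found.
    """
    match = re.search(r"libcudart\.so\.(\d+(?:\.\d+)*)", error_message)
    return match.group(1) if match else None


# Error text copied from the traceback above.
message = "libcudart.so.10.2: cannot open shared object file: No such file or directory"
needed = required_cudart_version(message)
print("extension expects CUDA runtime", needed)
```

Comparing this value with `torch.version.cuda` in the same Colab session should show whether the two disagree, which is what the report above suspects.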

Here's a solution that seems future-proof:

import torch

def format_pytorch_version(version):
    # Drop the local build suffix, e.g. "1.10.0+cu111" -> "1.10.0"
    return version.split('+')[0]

TORCH = format_pytorch_version(torch.__version__)

def format_cuda_version(version):
    # e.g. "11.1" -> "cu111"; torch.version.cuda is None on CPU-only builds
    return 'cpu' if version is None else 'cu' + version.replace('.', '')

CUDA = format_cuda_version(torch.version.cuda)
# Install the torch-scatter wheel that matches the installed torch and CUDA versions (needed by aviary)
!pip install torch-scatter -f https://pytorch-geometric.com/whl/torch-{TORCH}+{CUDA}.html
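
To see what the string handling above produces without needing torch or a notebook, here is a minimal standalone sketch. The function name `wheel_index_url` and the sample version strings are illustrative assumptions, and the `cpu` fallback for a missing CUDA version is an assumption about how CPU-only wheel pages are named.

```python
def wheel_index_url(torch_version, cuda_version):
    """Build the wheel index URL used with `pip install -f` in the snippet above.

    Hypothetical helper for illustration; the version strings are passed in
    instead of being read from torch.
    """
    # "1.10.0+cu111" -> "1.10.0"
    torch_tag = torch_version.split("+")[0]
    # "11.1" -> "cu111"; assumption: CPU-only builds (None) map to "cpu"
    cuda_tag = "cpu" if cuda_version is None else "cu" + cuda_version.replace(".", "")
    return f"https://pytorch-geometric.com/whl/torch-{torch_tag}+{cuda_tag}.html"


# Sample inputs chosen for illustration, not taken from the Colab session.
print(wheel_index_url("1.10.0+cu111", "11.1"))
# prints https://pytorch-geometric.com/whl/torch-1.10.0+cu111.html
```

Because both tags are derived from the running torch build, the install command should keep matching whatever versions Colab ships, as long as a wheel page exists for that combination.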

Thanks for reporting. Let me play around and see what's up with it.

@jdagdelen I think I've patched it to work for the time being. Can you check that it works now?

Looks good to me!