Error "libcudart.so.11.0: cannot open shared object file" when using Docker image
I've been trying to train the LIVECELL anchor-based model with my dataset, but the model failed to start learning.
I used the Docker image pytorch/pytorch:1.5-cuda10.1-cudnn7-devel
to match the versions you mentioned in the paper.
Importing detectron2 then failed with the error "libcudart.so.11.0: cannot open shared object file: No such file or directory".
The error traceback is as follows:
Traceback (most recent call last):
  File "train_net.py", line 27, in <module>
    from detectron2.data import MetadataCatalog
  File "/workspaces/livecell-anchor-based/detectron2-ResNeSt/detectron2/data/__init__.py", line 4, in <module>
    from .build import (
  File "/workspaces/livecell-anchor-based/detectron2-ResNeSt/detectron2/data/build.py", line 14, in <module>
    from detectron2.structures import BoxMode
  File "/workspaces/livecell-anchor-based/detectron2-ResNeSt/detectron2/structures/__init__.py", line 6, in <module>
    from .keypoints import Keypoints, heatmaps_to_keypoints
  File "/workspaces/livecell-anchor-based/detectron2-ResNeSt/detectron2/structures/keypoints.py", line 6, in <module>
    from detectron2.layers import interpolate
  File "/workspaces/livecell-anchor-based/detectron2-ResNeSt/detectron2/layers/__init__.py", line 3, in <module>
    from .deform_conv import DeformConv, ModulatedDeformConv
  File "/workspaces/livecell-anchor-based/detectron2-ResNeSt/detectron2/layers/deform_conv.py", line 10, in <module>
    from detectron2 import _C
ImportError: libcudart.so.11.0: cannot open shared object file: No such file or directory
This is probably because the CUDA toolkit version inside the Docker image (10.1) does not match the one that the compiled Detectron2-ResNeSt extension expects (11.x?).
Should I pin a specific version of Detectron2-ResNeSt?
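The suspected mismatch can be checked directly from inside the container. The following is a minimal sketch, not part of the repository: the helper names `required_cudart_version` and `same_major_minor` are hypothetical, and it assumes the version suffix in the missing library name reflects the CUDA runtime that the compiled `_C` extension was built against.

```python
import re


def required_cudart_version(error_message):
    """Return the CUDA runtime version named in a libcudart error, or None."""
    match = re.search(r"libcudart\.so\.(\d+(?:\.\d+)*)", error_message)
    return match.group(1) if match else None


def same_major_minor(version_a, version_b):
    """Compare two version strings on their major.minor components only."""
    return version_a.split(".")[:2] == version_b.split(".")[:2]


if __name__ == "__main__":
    # Error text copied from the traceback above.
    message = (
        "libcudart.so.11.0: cannot open shared object file: "
        "No such file or directory"
    )
    needed = required_cudart_version(message)

    try:
        import torch  # only needed for the comparison below

        available = torch.version.cuda  # None for CPU-only builds
    except ImportError:
        available = None

    print("Extension expects CUDA runtime:", needed)
    print("PyTorch was built with CUDA:", available)
    if needed and available and not same_major_minor(needed, available):
        print("Mismatch: the extension was compiled against a different CUDA version.")
```

If the two versions differ, the compiled extension most likely comes from a build made against another CUDA toolkit than the one in the image.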
Environment
Hardware
OS: Ubuntu 20.04.5 LTS on WSL 2
CPU: Intel Core i9-10940X
GPU: NVIDIA TITAN RTX (Turing architecture)
DRAM: 100GB
nvidia-smi
Hi @tsh11na,
The version of Detectron2-ResNeSt may well be what is causing the problem, and I see that the repo does not specify which version to use.
@nabeelkhalid92, can you help out with which version you used?
Hi @tsh11na,
You have to install a detectron2 build that was compiled against the same CUDA version as your image, i.e., 10.1.
You can find the matching detectron2 versions here: detectron2 installations
Also, the anchor-based model was implemented with Python v3.6.10, PyTorch v1.5.0, and Detectron2 v.2.1.
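As a rough illustration of how a matching build can be selected, the sketch below composes the index URL for upstream detectron2's prebuilt wheels from a CUDA and PyTorch version. The helper name `detectron2_wheel_index` is hypothetical, and the URL pattern is assumed from the upstream detectron2 installation page. These wheels are for upstream detectron2, so the Detectron2-ResNeSt fork used here may instead need to be rebuilt from source inside the CUDA 10.1 container.

```python
def detectron2_wheel_index(cuda_version, torch_version):
    """Compose the wheel index URL for a CUDA / PyTorch pair.

    Assumes the layout .../wheels/cu<major><minor>/torch<major>.<minor>/index.html
    used by the upstream detectron2 installation instructions.
    """
    cuda_tag = "cu" + "".join(cuda_version.split(".")[:2])
    torch_tag = "torch" + ".".join(torch_version.split(".")[:2])
    return (
        "https://dl.fbaipublicfiles.com/detectron2/wheels/"
        f"{cuda_tag}/{torch_tag}/index.html"
    )


if __name__ == "__main__":
    # Versions taken from the Docker image tag and the reply above.
    index_url = detectron2_wheel_index("10.1", "1.5.0")
    print("python -m pip install detectron2 -f " + index_url)
```

Whether a wheel for a given version pair is still hosted at that location should be checked against the installation page before relying on it.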