ufoym/deepo

tensorflow 2.5.0 CUDA compatibility

syncdoth opened this issue · 6 comments

The latest versions use tensorflow==2.5.0, CUDA==10.2, cudnn7.

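tensorflow==2.5.0 does not appear to be compatible with CUDA==10.2: per the log below, it tries to load CUDA 11 and cuDNN 8 libraries (libcusparse.so.11, libcudnn.so.8), which are not found in the image.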

full log:

2021-06-22 06:10:01.725884: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcusparse.so.11'; dlerror: libcusparse.so.11: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/nvidia/lib:/usr/local/nvidia/lib64
2021-06-22 06:10:01.725969: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudnn.so.8'; dlerror: libcudnn.so.8: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/nvidia/lib:/usr/local/nvidia/lib64
2021-06-22 06:10:01.725984: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1766] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
2021-06-22 06:10:01.864938: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1258] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-06-22 06:10:01.864983: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1264]      0 
2021-06-22 06:10:01.864993: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1277] 0:   N 
2021-06-22 06:10:02.428699: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 0 with properties: 
pciBusID: 0000:3e:00.0 name: Tesla V100-SXM2-32GB computeCapability: 7.0
coreClock: 1.53GHz coreCount: 80 deviceMemorySize: 31.75GiB deviceMemoryBandwidth: 836.37GiB/s
2021-06-22 06:10:02.428764: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1766] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
2021-06-22 06:10:02.429027: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1258] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-06-22 06:10:02.429043: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1264]
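
To make the mismatch easier to spot in a long log, here is a minimal sketch that pulls the missing library names out of the `dso_loader` warnings. The helper names `missing_libraries` and `soname_major` are hypothetical, and the regex only assumes the warning wording seen in the log above; the sample lines are copied from that log with the trailing `LD_LIBRARY_PATH` part trimmed.

```python
import re

# Matches the dso_loader warning wording seen in the log above.
MISSING_LIB = re.compile(r"Could not load dynamic library '([^']+)'")


def missing_libraries(log_text):
    """Return the library names that failed to load, in order, without duplicates."""
    seen = []
    for name in MISSING_LIB.findall(log_text):
        if name not in seen:
            seen.append(name)
    return seen


def soname_major(lib_name):
    """Return the major version from a name like 'libcudnn.so.8', or None."""
    match = re.search(r"\.so\.(\d+)", lib_name)
    return int(match.group(1)) if match else None


# Two warnings from the log above (LD_LIBRARY_PATH suffix trimmed).
sample = (
    "2021-06-22 06:10:01.725884: W tensorflow/stream_executor/platform/default/"
    "dso_loader.cc:64] Could not load dynamic library 'libcusparse.so.11'; "
    "dlerror: libcusparse.so.11: cannot open shared object file: "
    "No such file or directory\n"
    "2021-06-22 06:10:01.725969: W tensorflow/stream_executor/platform/default/"
    "dso_loader.cc:64] Could not load dynamic library 'libcudnn.so.8'; "
    "dlerror: libcudnn.so.8: cannot open shared object file: "
    "No such file or directory\n"
)

for name in missing_libraries(sample):
    print(name, soname_major(name))
# prints:
# libcusparse.so.11 11
# libcudnn.so.8 8
```

The major versions (11 for cuSPARSE, 8 for cuDNN) are the ones this TensorFlow build expects, which does not match an image that ships CUDA 10.2 and cudnn7.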

GPU test:

import tensorflow as tf
print(tf.__version__)
print(tf.test.is_gpu_available())

>>> 2.5.0
>>> False
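
As a complementary check, the following sketch prints which CUDA and cuDNN versions the installed TensorFlow wheel was built against, using `tf.sysconfig.get_build_info()` (which I believe is available from TensorFlow 2.3 onward) and `tf.config.list_physical_devices("GPU")`. The helper `summarize_build` is hypothetical, and the `cuda_version` / `cudnn_version` keys are assumed to be present only on GPU builds, so they are read with a fallback.

```python
def summarize_build(build_info):
    """Format the CUDA/cuDNN versions found in a build-info mapping."""
    # These keys are assumed to be missing on CPU-only builds, hence the fallback.
    cuda = build_info.get("cuda_version", "unknown")
    cudnn = build_info.get("cudnn_version", "unknown")
    return f"built for CUDA {cuda}, cuDNN {cudnn}"


try:
    import tensorflow as tf
except ImportError:
    tf = None  # TensorFlow is not installed; skip the live check.

if tf is not None:
    print(tf.__version__)
    print(summarize_build(tf.sysconfig.get_build_info()))
    # An empty list here means no GPU was registered.
    print(tf.config.list_physical_devices("GPU"))
```

If the reported CUDA major version differs from the one installed in the container, the missing-library warnings above are the expected result.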

A simple workaround was downgrading to tensorflow-gpu==2.3.0, which was still OK in my project.

I just wanted to raise that I had the same issue, and this solved it for me. Let's get the CUDA version updated!

ufoym commented

Sorry for any inconvenience. CUDA is now upgraded to 11.1 by default.

Ah, I think the issue I was having is that the “all-jupyter” tag isn’t pointing to the latest cu110 version. Let me check Docker Hub.

Yeah, the “all-jupyter” tag is still pointing to the same digest as “all-jupyter-cu101”.

ufoym commented

@vanakema Sorry for that mistake! We will fix it ASAP.

ufoym commented

Fixed. Feel free to reopen this issue if the problem still exists.