DeepRec-AI/DeepRec

DeepRec support cuda 10

welsonzhang opened this issue · 1 comment

System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 20.04): Ubuntu 16.04
  • DeepRec version or commit id: main
  • Python version: python3.7
  • Bazel version (if compiling from source): 0.26.1
  • GCC/Compiler version (if compiling from source): gcc version 7.5.0
  • CUDA/cuDNN version: cuda 10.2 cudnn 7.6

Describe the problem
When compiling DeepRec with CUDA 10.2 and cuDNN 7.6, the build fails with a "No such file or directory" error. The details are as follows:

/root/.cache/bazel/_bazel_root/e5dd34e735e9b22c055e30807c86bf9e/execroot/org_tensorflow/bazel-out/host/bin/tensorflow/core/_objs/embedding_gpu/gpu_hash_table.cu.pic.d (No such file or directory)
In file included from tensorflow/core/framework/embedding/gpu_hash_table.cu.cc:25:0:
external/cuCollections/include/cuco/dynamic_map.cuh:21:23: fatal error: cub/cub.cuh: No such file or directory 
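
For anyone hitting the same error: a quick way to confirm the cause is to check whether the toolkit ships the CUB headers at all. As far as I know, CUB is bundled with the CUDA toolkit only from CUDA 11 onward, so a CUDA 10.2 install would be expected to lack `cub/cub.cuh`. A minimal sketch, assuming the toolkit lives under /usr/local/cuda as in the install steps of this issue (`has_cub` is a hypothetical helper name, not part of DeepRec or CUDA):

```shell
#!/bin/sh
# Hypothetical helper: succeed if the CUDA install rooted at $1
# contains the CUB umbrella header that cuCollections includes.
has_cub() {
    [ -f "$1/include/cub/cub.cuh" ]
}

# Assumption: CUDA_HOME is unset on most machines, so fall back to
# the default install prefix used in this issue.
cuda_home="${CUDA_HOME:-/usr/local/cuda}"

if has_cub "$cuda_home"; then
    echo "cub/cub.cuh found under $cuda_home/include"
else
    echo "cub/cub.cuh missing under $cuda_home/include"
fi
```

If the header is reported missing, the include failure in `dynamic_map.cuh` is expected with that toolkit, independent of the Bazel configuration.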

Provide the exact sequence of commands / steps that you executed before running into the problem

  1. install cuda and cudnn
wget https://developer.download.nvidia.com/compute/cuda/10.2/Prod/local_installers/cuda_10.2.89_440.33.01_linux.run
sudo sh cuda_10.2.89_440.33.01_linux.run

https://developer.nvidia.com/rdp/cudnn-archive
tar -xzvf cudnn-xxx.tar.gz
sudo cp cuda/include/* /usr/local/cuda/include
sudo cp cuda/lib/libcudnn* /usr/local/cuda/lib64
sudo chmod a+r /usr/local/cuda/include/cudnn.h /usr/local/cuda/lib64/libcudnn*
  2. bazel build DeepRec
bazel build -c opt --config=opt //tensorflow:libtensorflow_cc.so


Good catch. DeepRec's CI/CD currently builds against CUDA 11, so we haven't caught this compatibility issue with CUDA 10.

In any case, we suggest upgrading to CUDA 11 or CUDA 12 (both are supported by DeepRec), which also offer better performance.
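
To check which toolkit version the build will pick up before starting a long Bazel build, something like the following can help. It is only a sketch: `parse_release` and `is_supported` are hypothetical helper names, and the threshold of 11 simply reflects the suggestion above, not a documented DeepRec requirement.

```shell
#!/bin/sh
# Extract the "X.Y" release number from `nvcc --version` output on stdin.
# nvcc prints a line such as:
#   Cuda compilation tools, release 10.2, V10.2.89
parse_release() {
    sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p'
}

# Succeed if the major version of the release string $1 is 11 or newer.
# Assumption: 11 is the cutoff, based on the maintainers' reply.
is_supported() {
    major="${1%%.*}"
    [ -n "$major" ] && [ "$major" -ge 11 ]
}

if command -v nvcc >/dev/null 2>&1; then
    release="$(nvcc --version | parse_release)"
    if is_supported "$release"; then
        echo "CUDA $release: matches the suggested CUDA 11/12 range"
    else
        echo "CUDA $release: older than CUDA 11, the build may fail"
    fi
else
    echo "nvcc not found on PATH"
fi
```

With the CUDA 10.2 toolkit from this issue, `parse_release` yields `10.2` and `is_supported` fails, which matches the build failure reported here.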