[BUILD] build failed with GPU configuration
cyberkillor opened this issue · 1 comment
System information
- OS CPU: AMD EPYC 7V12 64-Core Processor
- Build image: alideeprec/deeprec-build:deeprec-dev-gpu-py38-cu116-ubuntu20.04, and use nvidia-docker
- OS Platform and Distribution (e.g., Linux Ubuntu 20.04): CentOS Linux release 7.9.2009 (Core)
- DeepRec version or commit id: 29ecde4
- Python version: 3.8.10
- Bazel version (if compiling from source): 5.3.1 (built from source)
- GCC/Compiler version (if compiling from source): 9.4
- CUDA/cuDNN version: 11.6
- GPU: Tesla T4
- GPU Driver version: 470.161.03
.tf_configure.bazelrc:
build --python_path="/usr/bin/python" # python 3.8.10
build:xla --define with_xla_support=true
build --config=xla
build:star --define with_star_support=true
build --config=star
build:pmem --define with_pmem_support=true
build:parquet_dataset --define with_parquet_dataset_support=true
build --config=parquet_dataset
build:api_compatible --define with_api_compatible=true
build --action_env TF_USE_CCACHE="0"
build --action_env CUDA_TOOLKIT_PATH="/usr/local/cuda"
build --action_env TF_CUDA_COMPUTE_CAPABILITIES="7.5"
build --action_env LD_LIBRARY_PATH="/usr/local/cuda/compat:/usr/local/nvidia/lib:/usr/local/nvidia/lib64"
build --action_env GCC_HOST_COMPILER_PATH="/usr/bin/gcc"
build --config=cuda
build:opt --copt=-march=native
build:opt --copt=-Wno-sign-compare
build:opt --host_copt=-march=native
build:opt --define with_default_optimizations=true
build:v2 --define=tf_api_version=2
test --flaky_test_attempts=3
test --test_size_filters=small,medium
test --test_tag_filters=-benchmark-test,-no_oss,-oss_serial
test --build_tag_filters=-benchmark-test,-no_oss
test --test_tag_filters=-gpu
test --build_tag_filters=-gpu
build --action_env TF_CONFIGURE_IOS="0"
build --config=noaws
build --config=nogcp
build --config=noignite
build --config=nokafka
build --config=numa
Describe the problem
Building with bazel build -c opt --config=opt //tensorflow/tools/pip_package:build_pip_package fails with an error.
The error output mentions std::__cxx11::basic_string, so I tried building with bazel build --cxxopt="-D_GLIBCXX_USE_CXX11_ABI=0" --host_cxxopt="-D_GLIBCXX_USE_CXX11_ABI=0" -c opt --config=opt //tensorflow/tools/pip_package:build_pip_package, which also fails with an error.
But if I comment out these lines:
#build --action_env CUDA_TOOLKIT_PATH="/usr/local/cuda"
#build --action_env TF_CUDA_COMPUTE_CAPABILITIES="7.5"
#build --action_env LD_LIBRARY_PATH="/usr/local/cuda/compat:/usr/local/nvidia/lib:/usr/local/nvidia/lib64"
#build --action_env GCC_HOST_COMPILER_PATH="/usr/bin/gcc"
#build --config=cuda
and build the CPU version with bazel build -c opt --config=opt //tensorflow/tools/pip_package:build_pip_package or with bazel build --cxxopt="-D_GLIBCXX_USE_CXX11_ABI=0" --host_cxxopt="-D_GLIBCXX_USE_CXX11_ABI=0" -c opt --config=opt //tensorflow/tools/pip_package:build_pip_package, the build succeeds.
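The std::__cxx11::basic_string symbol mentioned above belongs to the newer libstdc++ string ABI, which is the one selected when _GLIBCXX_USE_CXX11_ABI is 1. Objects built with different values of that macro generally cannot resolve each other's string symbols, which may be what is happening here, although the error output is not shown. The sketch below is only an illustration of how to tell the two kinds of symbol names apart; classify_symbol is a hypothetical helper, not part of DeepRec or Bazel.

```shell
# Classify a symbol name (mangled or demangled) by libstdc++ ABI markers.
# Prints "cxx11-abi" if the name carries the std::__cxx11 inline namespace
# or the [abi:cxx11] tag (mangled as B5cxx11); prints "other" otherwise.
# "other" does not prove the old ABI: many symbols carry no marker at all.
classify_symbol() {
  case "$1" in
    *__cxx11*|*abi:cxx11*|*B5cxx11*) echo "cxx11-abi" ;;
    *) echo "other" ;;
  esac
}

# Demangled and mangled forms of a new-ABI string symbol, then an old-ABI one.
classify_symbol 'std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >'
classify_symbol '_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEC1Ev'
classify_symbol 'std::basic_string<char, std::char_traits<char>, std::allocator<char> >'
```

On a real build, the names could come from the linker error itself or from the output of nm -D -C on the shared libraries involved.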
Fixed by adding --config=monolithic: bazel build --config=monolithic --cxxopt="-D_GLIBCXX_USE_CXX11_ABI=0" --host_cxxopt="-D_GLIBCXX_USE_CXX11_ABI=0" -c opt --config=opt //tensorflow/tools/pip_package:build_pip_package
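For repeated builds, the working flags can be kept in one small wrapper so they are not retyped each time. This is a minimal sketch for a POSIX shell; print_build_cmd and the RUN_BUILD switch are hypothetical names introduced here, and the flags are copied from the command above. By default the script only prints the command.

```shell
# Flags and target copied from the working command in this issue.
ABI_FLAG="-D_GLIBCXX_USE_CXX11_ABI=0"
TARGET="//tensorflow/tools/pip_package:build_pip_package"

# Print the full bazel command on a single line (no build is started).
print_build_cmd() {
  echo "bazel build --config=monolithic" \
    "--cxxopt=${ABI_FLAG} --host_cxxopt=${ABI_FLAG}" \
    "-c opt --config=opt ${TARGET}"
}

# RUN_BUILD is a hypothetical switch: set it to 1 to actually run bazel.
if [ "${RUN_BUILD:-0}" = "1" ]; then
  bazel build --config=monolithic \
    --cxxopt="${ABI_FLAG}" --host_cxxopt="${ABI_FLAG}" \
    -c opt --config=opt "${TARGET}"
else
  print_build_cmd
fi
```

Running the script with RUN_BUILD=1 in the environment, from the DeepRec source root inside the build container, would start the actual build.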