yangyanli/PointCNN

issue about the tf_sampling_compile.sh

YamingZ opened this issue · 11 comments

When I compile tf_sampling_so.so, the following warning appears:
nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
However, tf_sampling_so.so compiles successfully.
Then I run the command:
./train_val_shapenet.sh -g 0 -x shapenet_x8_2048_fps
The error messages in pointcnn_seg_shapenet_x8_2048_fps.txt are as follows:
Traceback (most recent call last):
  File "../train_val_seg.py", line 295, in <module>
    main()
  File "../train_val_seg.py", line 127, in main
    net = model.Net(points_augmented, features_augmented, is_training, setting)
  File "/home/whf/ZYM/PointCNN/pointcnn_seg.py", line 11, in __init__
    PointCNN.__init__(self, points, features, is_training, setting)
  File "/home/whf/ZYM/PointCNN/pointcnn.py", line 64, in __init__
    from sampling import tf_sampling
  File "/home/whf/ZYM/PointCNN/sampling/tf_sampling.py", line 15, in <module>
    sampling_module=tf.load_op_library(os.path.join(BASE_DIR, 'tf_sampling_so.so'))
  File "/home/whf/anaconda3/envs/tf_gpu/lib/python3.6/site-packages/tensorflow/python/framework/load_library.py", line 58, in load_op_library
    lib_handle = py_tf.TF_LoadLibrary(library_filename, status)
  File "/home/whf/anaconda3/envs/tf_gpu/lib/python3.6/site-packages/tensorflow/python/framework/errors_impl.py", line 516, in __exit__
    c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.NotFoundError: /home/whf/ZYM/PointCNN/sampling/tf_sampling_so.so: undefined symbol: _ZN10tensorflow8internal21CheckOpMessageBuilder9NewStringEv

My environment is as follows:
gcc version 4.8.4 (Ubuntu 4.8.4-2ubuntu1~14.04.4)
tensorflow 1.6.0
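
The undefined symbol in the traceback is a mangled C++ name, and decoding it shows which TensorFlow function the loader could not find. The following is only a minimal sketch: demangle_nested_name is a hypothetical helper, not part of PointCNN or TensorFlow, and it handles only the simple length-prefixed form seen here (a tool such as c++filt covers the general case).

```python
import re

def demangle_nested_name(symbol):
    """Split a simple Itanium-ABI nested name (_ZN<len><id>...E...) into its parts.

    Only plain length-prefixed identifiers are handled; templates,
    substitutions and ABI tags are out of scope for this sketch.
    """
    if not symbol.startswith("_ZN"):
        raise ValueError("not a simple nested name: %r" % symbol)
    pos = 3  # skip the "_ZN" prefix
    parts = []
    while pos < len(symbol) and symbol[pos].isdigit():
        match = re.match(r"\d+", symbol[pos:])
        length = int(match.group())           # identifier length
        start = pos + len(match.group())      # first character of the identifier
        parts.append(symbol[start:start + length])
        pos = start + length
    return "::".join(parts)

missing = "_ZN10tensorflow8internal21CheckOpMessageBuilder9NewStringEv"
print(demangle_nested_name(missing))
# tensorflow::internal::CheckOpMessageBuilder::NewString
```

The decoded name lives in TensorFlow's own namespace, so the custom op was built against headers that expect a symbol the TensorFlow library loaded at runtime does not export.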

I ran into the same problem and solved it by following this blog post: https://blog.csdn.net/DuinoDu/article/details/71788484?locationNum=9&fps=1.
You could try changing '-D_GLIBCXX_USE_CXX11_ABI=0' to '-D_GLIBCXX_USE_CXX11_ABI=1' in the g++ compile command.
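
Instead of trying both values, one option is to ask the installed TensorFlow which ABI setting it reports. This is a hedged sketch: it assumes a TensorFlow release that provides tf.sysconfig.get_compile_flags() (not confirmed here for the 1.6.0 install in this issue), and abi_define_from is a hypothetical helper introduced only for illustration.

```python
def abi_define_from(flags):
    """Return the -D_GLIBCXX_USE_CXX11_ABI=... entry from a list of compiler flags, or None."""
    for flag in flags:
        if flag.startswith("-D_GLIBCXX_USE_CXX11_ABI="):
            return flag
    return None

def tensorflow_compile_flags():
    """Return TensorFlow's reported compile flags, or [] if they are unavailable."""
    try:
        import tensorflow as tf
        # Assumption: this TensorFlow build exposes tf.sysconfig.get_compile_flags().
        return list(tf.sysconfig.get_compile_flags())
    except (ImportError, AttributeError):
        return []  # TensorFlow is missing or too old to report its flags

if __name__ == "__main__":
    # Prints the define to copy into the g++ line, or None if it could not be determined.
    print(abi_define_from(tensorflow_compile_flags()))
```

If a define is printed, using that exact value in the g++ line of tf_sampling_compile.sh keeps the custom op on the same ABI setting as the TensorFlow build it will be loaded into.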

@YamingZ Hi

I suggest updating gcc to version 5.0 or later and recompiling. You can also try the method @latstars described.

Thanks!

I have tried switching to '-D_GLIBCXX_USE_CXX11_ABI=1' as @latstars suggested, but it still doesn't work.

@YamingZ Did you update gcc to 5.0+ and use our original .sh file to compile?

Yes, I did. I used gcc-5.5.0 to compile tf_sampling_so.so.

This is my tf_sampling_compile.sh file:
#/bin/bash
PYTHON=python3
CUDA_PATH=/usr/local/cuda
TF_LIB=$($PYTHON -c 'import tensorflow as tf; print(tf.sysconfig.get_lib())')
PYTHON_VERSION=$($PYTHON -c 'import sys; print("%d.%d"%(sys.version_info[0], sys.version_info[1]))')
TF_PATH=$($PYTHON -c 'import tensorflow as tf; print(tf.sysconfig.get_include())')
$CUDA_PATH/bin/nvcc tf_sampling_g.cu -o tf_sampling_g.cu.o -c -O2 -DGOOGLE_CUDA=1 -x cu -Xcompiler -fPIC
g++ -std=c++11 tf_sampling.cpp tf_sampling_g.cu.o -o tf_sampling_so.so -shared -fPIC -L$TF_LIB -ltensorflow_framework -I $TF_PATH/external/nsync/public/ -I $TF_PATH -I $CUDA_PATH/include -lcudart -L $CUDA_PATH/lib64/ -O2 -D_GLIBCXX_USE_CXX11_ABI=1

@YamingZ
After updating gcc to 5.0, please use "-D_GLIBCXX_USE_CXX11_ABI=0" instead of "-D_GLIBCXX_USE_CXX11_ABI=1".

Thanks for your help, but I still can't solve this issue. I have tried all of the suggestions above.

@YamingZ
Sorry, I can't reproduce your error, so I don't have any other ideas for solving this problem. You can refer to https://github.com/charlesq34/pointnet2 and try compiling again if possible.
Thanks

Thank you. I created a new environment and reinstalled TF, CUDA and cuDNN, and now the problem is gone. Your model trains successfully.