CS-GangXu/TMNet

When I compile, I get errors

InstantWindy opened this issue · 6 comments

cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
/usr/bin/nvcc -DWITH_CUDA -I/home/pcl/Yang/TMNet-main/models/modules/DCNv2/src -I/home/pcl/anaconda3/envs/match/lib/python3.6/site-packages/torch/include -I/home/pcl/anaconda3/envs/match/lib/python3.6/site-packages/torch/include/torch/csrc/api/include -I/home/pcl/anaconda3/envs/match/lib/python3.6/site-packages/torch/include/TH -I/home/pcl/anaconda3/envs/match/lib/python3.6/site-packages/torch/include/THC -I/home/pcl/anaconda3/envs/match/include/python3.6m -c /home/pcl/Yang/TMNet-main/models/modules/DCNv2/src/cuda/dcn_v2_im2col_cuda.cu -o build/temp.linux-x86_64-3.6/home/pcl/Yang/TMNet-main/models/modules/DCNv2/src/cuda/dcn_v2_im2col_cuda.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --compiler-options '-fPIC' -DCUDA_HAS_FP16=1 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=_ext -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++11
/usr/lib/gcc/x86_64-linux-gnu/5/include/mwaitxintrin.h(36): error: identifier "__builtin_ia32_monitorx" is undefined

/usr/lib/gcc/x86_64-linux-gnu/5/include/mwaitxintrin.h(42): error: identifier "__builtin_ia32_mwaitx" is undefined

/home/pcl/anaconda3/envs/match/lib/python3.6/site-packages/torch/include/c10/util/Half-inl.h(21): error: identifier "__half_as_short" is undefined

/home/pcl/anaconda3/envs/match/lib/python3.6/site-packages/torch/include/ATen/cuda/NumericLimits.cuh(83): warning: calling a constexpr host function("from_bits") from a host device function("lowest") is not allowed. The experimental flag '--expt-relaxed-constexpr' can be used to allow this.

/home/pcl/anaconda3/envs/match/lib/python3.6/site-packages/torch/include/ATen/cuda/NumericLimits.cuh(84): warning: calling a constexpr host function("from_bits") from a host device function("max") is not allowed. The experimental flag '--expt-relaxed-constexpr' can be used to allow this.

/home/pcl/anaconda3/envs/match/lib/python3.6/site-packages/torch/include/ATen/cuda/NumericLimits.cuh(85): warning: calling a constexpr host function("from_bits") from a host device function("lower_bound") is not allowed. The experimental flag '--expt-relaxed-constexpr' can be used to allow this.

/home/pcl/anaconda3/envs/match/lib/python3.6/site-packages/torch/include/ATen/cuda/NumericLimits.cuh(86): warning: calling a constexpr host function("from_bits") from a host device function("upper_bound") is not allowed. The experimental flag '--expt-relaxed-constexpr' can be used to allow this.

/home/pcl/anaconda3/envs/match/lib/python3.6/site-packages/torch/include/THC/THCNumerics.cuh(195): error: identifier "__half_as_ushort" is undefined

4 errors detected in the compilation of "/tmp/tmpxft_00000510_00000000-7_dcn_v2_im2col_cuda.cpp1.ii".
error: command '/usr/bin/nvcc' failed with exit status 2

These errors usually mean the build picked up /usr/bin/nvcc, a system CUDA toolkit that doesn't match your gcc/PyTorch headers. You can try modifying the CUDA paths in models/modules/DCNv2/make.sh to:

#!/usr/bin/env bash

# You may need to modify the following paths before compiling.
CUDA_HOME=/usr/local/cuda-10.0 \
CUDNN_INCLUDE_DIR=/usr/local/cuda-10.0/include \
CUDNN_LIB_DIR=/usr/local/cuda-10.0/lib64 \
python setup.py build develop
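
After editing, re-run make.sh. It may also help to confirm that the toolkit at that path matches the CUDA version your PyTorch build expects; a minimal sketch, assuming the cuda-10.0 path from the script above:

# Sanity check: the CUDA toolkit version reported by nvcc should match
# the version PyTorch was built against (torch.version.cuda).
# The nvcc path below assumes the cuda-10.0 install from make.sh above.
import subprocess
import torch

print("PyTorch built with CUDA:", torch.version.cuda)

nvcc = "/usr/local/cuda-10.0/bin/nvcc"
out = subprocess.run([nvcc, "--version"], stdout=subprocess.PIPE)
print(out.stdout.decode())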

Thanks, I solved it. However, I'm using this repo to train on my own dataset, and the loss value is very large, up to 1e6.

@InstantWindy Can you try to train the model with a smaller learning rate?
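
For reference, here is a minimal sketch of lowering the learning rate on a standard PyTorch optimizer (the Linear module is a stand-in, not the TMNet generator, and TMNet's actual training config may expose this differently):

import torch

model = torch.nn.Linear(8, 8)  # stand-in module for illustration
optimizer = torch.optim.Adam(model.parameters(), lr=4e-4)

# Lower the learning rate in place, e.g. from 4e-4 to 4e-5.
for group in optimizer.param_groups:
    group['lr'] = 4e-5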

Yeah, I tried 4e-5 and 4e-6, but it doesn't work.

Sorry for the late reply @InstantWindy. In my training process the loss value is about 1e+04. This is mainly because we set 'reduction' to 'sum' in the loss function. I recommend checking the validation metric to see whether the training process is going well.
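
To illustrate the effect of the reduction (this is not necessarily the repo's exact loss code): with reduction='sum' the loss scales with the number of elements, so a large absolute value does not by itself mean training is broken.

# Illustration only: the same error gives a loss tens of thousands of
# times larger under reduction='sum' than under reduction='mean'.
import torch
import torch.nn.functional as F

pred = torch.rand(4, 3, 128, 128)    # hypothetical batch of frames
target = torch.rand(4, 3, 128, 128)

loss_sum = F.l1_loss(pred, target, reduction='sum')
loss_mean = F.l1_loss(pred, target, reduction='mean')

print(loss_sum.item())                 # tens of thousands here
print(loss_sum.item() / pred.numel())  # equals the 'mean' loss
print(loss_mean.item())

Dividing the summed loss by the number of elements recovers the per-element average, which is easier to compare across batch and image sizes.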