make ops error: No rule to make target libculibos.a when compile libcaffe2_detectron_custom_ops_gpu.so

Question

Johnqczhang opened this issue 6 years ago · 20 comments

When I ran "make ops" I encountered many errors, which made me very sad. Most of them were solved in #152, thanks to @linkinpark213. However, here is a new error that did not come up in that issue. The error message looks like this:

[ 12%] Building NVCC (Device) object CMakeFiles/caffe2_detectron_custom_ops_gpu.dir/detectron/ops/caffe2_detectron_custom_ops_gpu_generated_zero_even_op.cu.o
[ 25%] Building NVCC (Device) object CMakeFiles/caffe2_detectron_custom_ops_gpu.dir/detectron/ops/caffe2_detectron_custom_ops_gpu_generated_pool_points_interp.cu.o
[ 37%] Building CXX object CMakeFiles/caffe2_detectron_custom_ops.dir/detectron/ops/pool_points_interp.cc.o
[ 50%] Building CXX object CMakeFiles/caffe2_detectron_custom_ops.dir/detectron/ops/zero_even_op.cc.o
[ 62%] Linking CXX shared library libcaffe2_detectron_custom_ops.so
[ 62%] Built target caffe2_detectron_custom_ops
Scanning dependencies of target caffe2_detectron_custom_ops_gpu
make[2]: *** No rule to make target `/usr/local/cuda/lib64/libculibos.a', needed by `libcaffe2_detectron_custom_ops_gpu.so'. Stop.
make[2]: *** Waiting for unfinished jobs....
[ 75%] Building CXX object CMakeFiles/caffe2_detectron_custom_ops_gpu.dir/detectron/ops/pool_points_interp.cc.o
[ 87%] Building CXX object CMakeFiles/caffe2_detectron_custom_ops_gpu.dir/detectron/ops/zero_even_op.cc.o
make[1]: *** [CMakeFiles/caffe2_detectron_custom_ops_gpu.dir/all] Error 2
make: *** [all] Error 2

The problem is that when building the library "libcaffe2_detectron_custom_ops_gpu.so", make can't find the dependent library "libculibos.a" under the path "/usr/local/cuda/lib64". I found that this path in fact doesn't exist on my system, since I didn't create a symbolic link for a specific version of CUDA.
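
A minimal sketch for confirming this kind of diagnosis: it pulls the missing target path out of captured make output and checks whether that file is present. The helper names `missing_targets` and `report` are hypothetical, and the regular expression only assumes the quoted "No rule to make target" message format shown in the log above.

```python
import os
import re

# GNU make quotes the target name; older versions open the quote with a
# backtick, newer ones with an apostrophe, so both are accepted here.
_MISSING_TARGET = re.compile(r"No rule to make target [`']?([^`',]+)'?")


def missing_targets(make_output):
    """Return the target paths that make reported it has no rule to build."""
    return _MISSING_TARGET.findall(make_output)


def report(make_output):
    """Print each missing target and whether the file exists on disk."""
    for path in missing_targets(make_output):
        status = "exists" if os.path.exists(path) else "not found"
        print(f"{path}: {status}")
```

Calling `report(...)` on the saved output of `make` would list each path that make could not resolve, which makes it easy to see whether the file is absent or only the directory name is wrong.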

By checking the file "CMakeFiles/caffe2_detectron_custom_ops_gpu.dir/build.make" generated by the "cmake .." command, I found that all paths to CUDA libraries were generated correctly with the specific CUDA version (e.g., /usr/local/cuda-9.0/lib64/libcublas.so), with this one exception (i.e., libculibos.a).

So, here's my solution which I think should solve this problem (but I still don't know why this happened, maybe it's a minor bug):

1. Execute cmake .. in $DENSEPOSE/build.
2. Check $DENSEPOSE/build/CMakeFiles/caffe2_detectron_custom_ops_gpu.dir/build.make and $DENSEPOSE/build/CMakeFiles/caffe2_detectron_custom_ops_gpu.dir/link.txt to make sure all dependencies can be found in the specified paths. (In my case, I replaced /usr/local/cuda/lib64 with /usr/local/cuda-9.0/lib64 in both files.)
3. Run make -j8 (or simply make). After this, you can test your compilation by running $DENSEPOSE/detectron/tests/test_zero_even_op.py
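
The manual edit in step 2 could be scripted roughly as follows. This is only a sketch: `repoint_cuda` and `patch_file` are hypothetical helper names, and the two prefixes are the ones from my setup, so substitute your own. Note that re-running cmake regenerates these files and discards the edit.

```python
from pathlib import Path


def repoint_cuda(text, old_prefix, new_prefix):
    """Replace old_prefix with new_prefix wherever it is followed by '/'."""
    # Matching on the trailing slash leaves already-versioned paths such as
    # /usr/local/cuda-9.0/lib64/... untouched, so the rewrite is repeatable.
    return text.replace(old_prefix + "/", new_prefix + "/")


def patch_file(path, old_prefix, new_prefix):
    """Rewrite one generated file in place; return True if it changed."""
    path = Path(path)
    original = path.read_text()
    patched = repoint_cuda(original, old_prefix, new_prefix)
    if patched != original:
        path.write_text(patched)
    return patched != original
```

For example, `patch_file(link_txt, "/usr/local/cuda/lib64", "/usr/local/cuda-9.0/lib64")` would be called once for link.txt and once for the generated makefile in the same directory.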

Although this is not a big issue, there's still a chance that someone else may encounter it, so I hope this helps :)

Answer 1 · 2019-01-17T03:19:13.000Z

Since this issue has been solved, and its cause lies mainly in the compilation of caffe2 and detectron rather than in this project, I'm closing it. If necessary, feel free to reopen it. For anyone who runs into other issues, this blog post from @linkinpark213 offers many helpful solutions.

Another thing worth noting is that compiling this custom operator generates two dynamic libraries (libcaffe2_detectron_custom_ops.so and libcaffe2_detectron_custom_ops_gpu.so) in $DENSEPOSE/build. However, when I ran the test program $DENSEPOSE/detectron/tests/test_zero_even_op.py for the first time, an AssertionError occurred, indicating that the program could not find these two libraries in $DETECTRON/build. So I copied the files into that location, and then the test program worked fine. You can also add $DENSEPOSE/build to the $PYTHONPATH environment variable before you run the test program.
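
A small sketch of the copy step, for anyone who wants to script it. The function names are hypothetical, and the two directories are passed in by the caller (for me they were $DENSEPOSE/build and $DETECTRON/build); nothing here depends on Detectron's own lookup code.

```python
import shutil
from pathlib import Path

# The two libraries produced by the custom-op build, as named above.
CUSTOM_OPS_LIBS = (
    "libcaffe2_detectron_custom_ops.so",
    "libcaffe2_detectron_custom_ops_gpu.so",
)


def missing_custom_ops(build_dir):
    """Return the custom-op library names that are absent from build_dir."""
    build_dir = Path(build_dir)
    return [name for name in CUSTOM_OPS_LIBS if not (build_dir / name).is_file()]


def copy_custom_ops(src_build, dst_build):
    """Copy whichever libraries dst_build lacks from src_build."""
    dst = Path(dst_build)
    dst.mkdir(parents=True, exist_ok=True)
    for name in missing_custom_ops(dst):
        shutil.copy2(Path(src_build) / name, dst / name)
```

Running `missing_custom_ops(...)` on the destination first is a quick way to see whether the AssertionError is really caused by the files not being there.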

BTW, I didn't install (or compile) caffe2 from source. Instead, I installed the latest stable pytorch-1.0 from conda, which already has caffe2 integrated. I installed Detectron successfully following the instructions in the official GitHub repository, and the demo runs without any error.
System: CentOS
Python: 3.6 (from Anaconda)
CUDA: 9.0
cuDNN: 7.4.2

Answer 2 · 2019-01-22T02:52:45.000Z

Recently, I found the root cause of this problem. Because I installed caffe2 via conda install pytorch, cmake .. loads all the .cmake files located in /path/to/my/env/lib/python2.7/site-packages/torch/share/cmake/Caffe2 before compiling the ops, and /usr/local/cuda/lib64/libculibos.a is hardcoded in one of these files, named Caffe2Targets.cmake. So the correct solution is to replace /usr/local/cuda/lib64/libculibos.a with /usr/local/cuda-x.x/lib64/libculibos.a in that file if you have multiple CUDA versions installed on your system. After this, you don't need to check and edit build.make and link.txt anymore.
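
To locate the hardcoded entry without opening every file by hand, something like the following could be used. It is a sketch under assumptions: the helper names are hypothetical, and the directory to scan is whatever Caffe2 cmake directory your own environment uses.

```python
from pathlib import Path


def matching_lines(text, needle):
    """Return (line number, stripped line) pairs for lines containing needle."""
    return [
        (number, line.strip())
        for number, line in enumerate(text.splitlines(), start=1)
        if needle in line
    ]


def find_references(cmake_dir, needle):
    """Yield (file, line number, line) for matches in *.cmake files under cmake_dir."""
    for cmake_file in sorted(Path(cmake_dir).rglob("*.cmake")):
        text = cmake_file.read_text(errors="replace")
        for number, line in matching_lines(text, needle):
            yield cmake_file, number, line
```

Searching for `"libculibos.a"` in the Caffe2 cmake directory should point at the file and line that need the versioned path.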

Answer 3 · 2019-06-11T18:22:15.000Z

I don't know if this is related, but I had a similar issue while following the dcgan tutorial and trying to enable GPU. (No problem with CPU.) I downloaded the latest stable build of libtorch at
https://download.pytorch.org/libtorch/cu100/libtorch-shared-with-deps-latest.zip
I pointed cmake to the location of cuDNN, and after this cmake ran successfully. But on make, I get:

Scanning dependencies of target dcgan
[ 50%] Building CXX object CMakeFiles/dcgan.dir/dcgan.cpp.o
make[2]: *** No rule to make target '/usr/local/cuda/lib64/libculibos.a', needed by 'dcgan'. Stop.
CMakeFiles/Makefile2:67: recipe for target 'CMakeFiles/dcgan.dir/all' failed
make[1]: *** [CMakeFiles/dcgan.dir/all] Error 2
Makefile:83: recipe for target 'all' failed
make: *** [all] Error 2

Answer 4 · 2019-08-26T12:14:50.000Z

@WillemvJ so what did you do?

Answer 5 · 2019-10-09T13:41:51.000Z

Hi @Johnqczhang, firstly, you have a brilliant guide to a very stressful installation, and for that you have my heartfelt thanks. Without you and @linkinpark213 I know I wouldn't have gotten as far as I have.

One question on your instructions: in the section "Build the custom ops library" at step 3 in your guide, the first cmake command runs beautifully. However, the second command (the "make") fails with:

cannot find -lcaffe2_gpu_library
collect2: error: ld returned 1 exit status
CMakeFiles/caffe2_detectron_custom_ops_gpu.dir/build.make:1300: recipe for target 'libcaffe2_detectron_custom_ops_gpu.so' failed
make[2]: *** [libcaffe2_detectron_custom_ops_gpu.so] Error 1
CMakeFiles/Makefile2:72: recipe for target 'CMakeFiles/caffe2_detectron_custom_ops_gpu.dir/all' failed
make[1]: *** [CMakeFiles/caffe2_detectron_custom_ops_gpu.dir/all] Error 2
Makefile:129: recipe for target 'all' failed
make: *** [all] Error 2

Did you encounter this as well in your process?

I'm on Ubuntu 16.04, using CUDA 10.1 and cuDNN 7.6.4. CMake is at 3.13.3 and caffe2 is version 1.3.

Any ideas would be more than welcome, and if I asked this question incorrectly, please let me know and I'll put it in the correct format.

Many thanks fellow developers.

Answer 6 · 2019-10-10T08:49:39.000Z

I'm not sure, but the error message cannot find -lcaffe2_gpu_library suggests that your Caffe2 library was not installed properly. Can you pass the Caffe2 Installation Test in your current environment?
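
For reference, the check can be wrapped roughly like this. The wrapper name `caffe2_cuda_devices` is hypothetical; the import and the `workspace.NumCudaDevices()` call are the ones the Caffe2 install check relies on.

```python
def caffe2_cuda_devices():
    """Return the GPU count Caffe2 reports, or None if Caffe2 cannot be imported."""
    try:
        from caffe2.python import workspace
    except ImportError:
        # Caffe2 is not importable in this environment at all.
        return None
    return workspace.NumCudaDevices()
```

A result of `None` means the import itself failed, while `0` means Caffe2 imported but sees no usable GPU, which usually points at a CPU-only build or a CUDA setup problem.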

Answer 7 · 2019-10-11T09:42:27.000Z

Hi @linkinpark213, Thanks for your response and your extensive work on this project as well!

Yes please, output is as follows:

I looked through the Caffe2Config.cmake, Caffe2ConfigVersion.cmake, Caffe2Targets.cmake, Caffe2Targets-release.cmake, and I wasn't seeing the line that @Johnqczhang had mentioned in his earlier comment.

I'm afraid I built CUDA and cuDNN from source, as well as protobuf.

Answer 8 · 2019-10-11T09:44:49.000Z

@linkinpark213 btw, I'm using pip3 and python3.

I'm not using conda at all as that was coming into conflict with pytorch cmake dependencies.

Answer 9 · 2019-10-12T02:02:33.000Z

@mvmnt Well, though I still can't guess where your problem could be, here's good news that Detectron2 was just released on Oct 10th and a new version of DensePose based on Detectron2 and PyTorch is included. Simply running pip3 install torch torchvision would get you ready. Maybe it's time to say goodbye to Caffe2.

Answer 10 · 2019-10-14T08:53:23.000Z

Hey @linkinpark213,
Thanks for the response as usual, much appreciated!

This is more tragic than Ace's death...felt as though I got so close, but still can't manage to crack it.
I'll start another environment and try the new version.

By any chance however, might it be a problem with my CMakeLists.txt? I took this one and replaced the paths as instructed, is it possible that I might have gotten the paths wrong or something?

(my last desperate attempt at trying to find what's wrong)

Answer 11 · 2019-10-14T12:45:03.000Z

Also, my Python version is 3.5.2 and my pip3 is at 8.1.1.
If you need any further info at all from me, please don't hesitate to ask me, I'll help in whatever way I can.

Answer 12 · 2019-10-14T13:42:04.000Z

Hi @mvmnt, sorry for the late reply. I've been working on a PyTorch-based DensePose (more specifically, based on maskrcnn-benchmark) for months and have obtained much better baseline results. I'm sorry I cannot release my code right now for some reasons. But I believe my implementation should have much in common with the recently released DensePose based on Detectron2, so you can try that first. Thanks!

Answer 13 · 2019-10-14T14:16:47.000Z

Hi guys,

Thanks so much for the support.

I'll give this new one a try then, and fingers crossed, it's easier to install than its previous iterations 😅
Initially I was attempting to install this as a prerequisite for another project called vid2vid, so hopefully there are no compatibility issues between this new version and that project (one can only hope).

Answer 14 · 2019-10-15T02:10:09.000Z

Hi @mvmnt , I replicated your issue using Caffe2 1.3 (I built it from source). I then tried the 1.1 version instead and things worked well.

To build Caffe2 1.1 you can execute

git clone git@github.com:pytorch/pytorch.git
cd pytorch
git checkout 142c973f4179e768164cd578951489e89021b29c
git submodule sync
git submodule update --init --recursive
python setup.py install

Answer 15 · 2019-10-15T05:56:13.000Z

@mvmnt I also compared the library (*.so) files built from the source code of both versions. Here is what I found in PyTorch 1.1:

pytorch-1.1/torch/lib/libcaffe2.so
pytorch-1.1/torch/lib/libonnxifi_dummy.so
pytorch-1.1/torch/lib/libc10.so
pytorch-1.1/torch/lib/libtorch.so
pytorch-1.1/torch/lib/libshm.so
pytorch-1.1/torch/lib/libfoxi_dummy.so
pytorch-1.1/torch/lib/libfoxi.so
pytorch-1.1/torch/lib/libonnxifi.so
pytorch-1.1/torch/lib/libtorch_python.so
pytorch-1.1/torch/lib/libthnvrtc.so
pytorch-1.1/torch/lib/libcaffe2_module_test_dynamic.so
pytorch-1.1/torch/lib/libcaffe2_observers.so
pytorch-1.1/torch/lib/libcaffe2_gpu.so
pytorch-1.1/torch/lib/libcaffe2_detectron_ops_gpu.so
pytorch-1.1/torch/lib/libc10_cuda.so
pytorch-1.1/torch/lib/python3.7/site-packages/caffe2/python/caffe2_pybind11_state_gpu.cpython-37m-x86_64-linux-gnu.so
pytorch-1.1/torch/lib/python3.7/site-packages/caffe2/python/caffe2_pybind11_state.cpython-37m-x86_64-linux-gnu.so

But in 1.3, there are only

pytorch-1.3/torch/lib/libcaffe2_nvrtc.so
pytorch-1.3/torch/lib/libc10.so
pytorch-1.3/torch/lib/libtorch.so
pytorch-1.3/torch/lib/libshm.so
pytorch-1.3/torch/lib/libtorch_python.so
pytorch-1.3/torch/lib/libcaffe2_module_test_dynamic.so
pytorch-1.3/torch/lib/libcaffe2_observers.so
pytorch-1.3/torch/lib/libcaffe2_detectron_ops_gpu.so
pytorch-1.3/torch/lib/libc10_cuda.so
pytorch-1.3/torch/lib/python3.7/site-packages/caffe2/python/caffe2_pybind11_state_gpu.cpython-37m-x86_64-linux-gnu.so
pytorch-1.3/torch/lib/python3.7/site-packages/caffe2/python/caffe2_pybind11_state.cpython-37m-x86_64-linux-gnu.so

As you can see, libcaffe2_gpu.so is gone in 1.3, which is why -lcaffe2_gpu_library was not found in your installation.
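
The same comparison can be done mechanically with a set difference over the file names. This is only a sketch with hypothetical helper names; it expects two lists of paths like the ones above, for example collected with `find . -name "*.so"` in each source tree.

```python
from pathlib import PurePosixPath


def lib_names(paths):
    """Reduce library paths to their bare file names."""
    return {PurePosixPath(p).name for p in paths}


def only_in_first(first, second):
    """Return the sorted library names present in first but not in second."""
    return sorted(lib_names(first) - lib_names(second))
```

Passing the 1.1 list as `first` and the 1.3 list as `second` reports the libraries that disappeared between the two versions, libcaffe2_gpu.so among them.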

Answer 16 · 2019-10-15T13:01:53.000Z

Hi @linkinpark213,

Many thanks for your help here!

This is intriguing, I'll give it a try after work today and hopefully get lucky.

I've started looking into the new densepose version that you and @Johnqczhang suggested, and was planning on giving it a swing on the weekend to see if it plays well.

You guys are the best by far, even if it doesn't work, I'm grateful for you two, and everyone's support on this hunt!

Answer 17 · 2019-10-16T16:20:39.000Z

Hiya @linkinpark213,
Unfortunately, while that did resolve that problem, that version doesn't have GPU support, which I need 😰.
I found that out when I ran the second test and it gave me 0.
The densepose test also fails as a result of there being no GPU support.

So I've found that lib in that version, but I lose GPU support. If I go with the latest release, it's missing the lib. Seems like a classic "rock and a hard place" situation 😄.

Answer 18 · 2019-10-16T18:17:50.000Z

Sorry for a late reaction. I think what worked for me in the end were things mentioned here:

pytorch/pytorch#15476

Specifically, I think I used the "bandaid solution". Sorry for not documenting what I did.

Answer 19 · 2019-10-17T09:36:01.000Z

Hi @WillemvJ,
I'm assuming you mean you implemented the sym link solution. I think this would be useful if the user has multiple installations of cuda on their system wouldn't it? My thought is that if there's only one version of Cuda on the user's environment (be it conda or normal pip) the libraries should point to that version that exists.

I've also seen in that post that it could be a result of the path being hardcoded in, if that is the case then that provides quite a pickle to solve, and the solution might be to just make a sym link.
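
If the symlink route is what is meant, it could look roughly like the sketch below. This is an assumption on my part about the approach, not something confirmed in the linked post; `ensure_symlink` is a hypothetical helper, and creating a link under /usr/local normally needs root.

```python
import os


def ensure_symlink(target, link_path):
    """Create link_path -> target unless something already exists at link_path.

    Returns True if a new symlink was created, False otherwise.
    """
    # lexists is also True for a dangling symlink, so nothing is overwritten.
    if os.path.lexists(link_path):
        return False
    os.symlink(target, link_path)
    return True
```

For instance, `ensure_symlink("/usr/local/cuda-10.1", "/usr/local/cuda")` would make the unversioned path resolve to the versioned install, so a hardcoded /usr/local/cuda/lib64/libculibos.a can be found, provided that file exists in the versioned directory.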

I'll try the method mentioned as I cannot go with an older version of Caffe2 unfortunately, as my purpose requires gpu support, and the missing library. I'll be sure to keep this thread updated with my findings.

Answer 20 · 2019-11-01T10:29:31.000Z

Hey guys,

Unfortunately, the method above didn't work when I tried it. It only led me down a debugging path for the last 2 weeks....

I think I will have to try my hand at Detectron2 and see where this goes, and hopefully I'll have more luck there.

Thanks again however for the help!