Invalid device function
ranjeethks opened this issue · 6 comments
Hi, I am using Torch7, OpenCV 3, png++ (0.2.9), and libpng 1.6 on Ubuntu 16.04.
I get the following error when I run:
./main.lua kitti fast -a predict -net_fname net/net_kitti_fast_-a_train_all.t7 -left samples/input/kittiL.png -right samples/input/kittiR.png -disp_max 70
kitti fast -a predict -net_fname net/net_kitti_fast_-a_train_all.t7 -left samples/input/kittiL.png -right samples/input/kittiR.png -disp_max 70
luajit: /home/ubuntu/mc-cnn/Normalize2.lua:11: invalid device function
stack traceback:
[C]: in function 'Normalize_forward'
/home/ubuntu/mc-cnn/Normalize2.lua:11: in function 'updateOutput'
./main.lua:911: in function 'forward_free'
./main.lua:945: in function 'stereo_predict'
./main.lua:1101: in main chunk
[C]: at 0x00405d50
What could be the issue?
Hi, I have the same problem. Have you found a solution?
Thanks.
Any luck with this issue? I have the same problem.
Not yet. I'm trying to reproduce some results from the KITTI Vision Benchmark (http://www.cvlibs.net/datasets/kitti/eval_scene_flow.php?benchmark=stereo) and I get this same 'invalid device function' error with two methods: MC-CNN-acrt (the same error as ranjeethks) and L-ResMatch.
In L-ResMatch, when I run:
scripts/preprocess_kitti.lua -color rgb -storage storage
I get the error:
luajit: scripts/preprocess_kitti.lua:113: invalid device function
stack traceback:
[C]: in function 'remove_nonvisible'
scripts/preprocess_kitti.lua:113: in main chunk
[C]: at 0x00405d50
I think my problem is related to memory and/or versions. How can we find out which CUDA/Torch versions the author used for this code?
I solved my problem. I use a cluster, and I had not been submitting my job to the cluster. That is why I got this error.
I solved my problem too.
The problem was that I had not set the correct CUDA compute capability for my GPU in the Makefiles of both projects.
My GPU is a Quadro K1100M, which has CUDA compute capability 3.0, so I had to change the parameter sm_35 to sm_30 in the Makefiles of my projects (sm_35 means CUDA compute capability 3.5, and so on).
A table of the CUDA compute capability of each GPU can be found here:
https://developer.nvidia.com/cuda-gpus
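The mapping from a compute capability to the nvcc architecture name is simply the version number with the dot removed. A minimal shell sketch of that mapping follows; the function name cc_to_arch is hypothetical and is not part of either project. If the CUDA samples are installed, the deviceQuery sample should also report the capability of the installed GPU.

```shell
# Hypothetical helper: turn a compute capability such as "3.0" into the
# matching nvcc architecture name ("sm_30") by dropping the dot.
cc_to_arch() {
    printf 'sm_%s\n' "$(printf '%s' "$1" | tr -d '.')"
}

cc_to_arch 3.0   # prints sm_30 (e.g. the Quadro K1100M mentioned above)
cc_to_arch 3.5   # prints sm_35 (the value hard-coded in the Makefiles)
```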
To be more precise, I changed the following lines in the Makefiles of the projects:
---------- In L-ResMatch project:
libadcensus.so: src/adcensus.cu
	$(CUDA)/bin/nvcc -arch sm_35 -O3 -DNDEBUG --compiler-options '-fPIC' -o libadcensus.so --shared src/adcensus.cu
libcuresmatch.so: src/curesmatch.cu
	$(CUDA)/bin/nvcc -arch sm_35 -O3 -DNDEBUG --compiler-options '-fPIC' -o libcuresmatch.so --shared src/curesmatch.cu
---------- In MC-CNN-acrt project:
libadcensus.so: adcensus.cu SpatialLogSoftMax.cu
	nvcc -arch sm_35 -O3 -DNDEBUG --compiler-options '-fPIC' -o libadcensus.so --shared adcensus.cu
Just change sm_xx to the correct value for your video card.
After that, it worked perfectly. (I used CUDA 8 to run the code.)
=]
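For anyone who prefers not to edit the Makefiles by hand, the sm_xx change can be sketched as a one-line sed substitution. This is only a sketch: the function name set_arch is hypothetical, and it assumes the Makefile passes the architecture as "-arch sm_NN", as in the lines quoted above.

```shell
# Hypothetical helper: rewrite every "-arch sm_NN" in a Makefile to the
# given architecture, keeping a backup copy next to it (Makefile.bak).
set_arch() {
    makefile=$1
    arch=$2
    sed -i.bak -E "s/-arch[ =]+sm_[0-9]+/-arch ${arch}/g" "$makefile"
}

# Example for a compute capability 3.0 GPU:
#   set_arch Makefile sm_30
#   make -B    # GNU make: force a rebuild of the .so files
```

As the rules are quoted above, the .so targets depend only on the .cu sources, so editing the Makefile alone will not trigger a rebuild. Deleting the old .so files, or running GNU make with -B, forces nvcc to recompile for the new architecture.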