Error using fnearneigh_gpu
I am working on a huge dataset and am having trouble getting the GPU script to work. I was able to solve the library issues, but I am stuck on this error:
TEprepare.m - line 934: Optimal tau for this dataset is: 0.39
TEprepare.m - line 946: Optimal dimension for this dataset is: 8
TEprepare.m - line 1056: Adding TEprepare structure to original data structure
InteractionDelayReconstruction_calculate.m - line 113:
############# OPTIMIZING INFORMATION TRANSFER DELAY
TEfindDelay.m - line 45: Estimating TE for u = 40 ms
Error using fnearneigh_gpu
Error detected in the GPU (check console)
Error in TEcallGPUsearch (line 295)
[~, distance_p21] = fnearneigh_gpu(single(pointset_p21),single(pointset_p21),k_th,TheilerT,nchunks);
...
The error reported in the console is:
invalid device function
I tried recompiling with the latest NVIDIA CUDA, but no luck so far.
Hi @dpdarrow ,
this seems to be a problem with your device. From googling around, I gathered that you may have to specify a different architecture when compiling the CUDA functions (-arch=compute_xx and -code=sm_xx). Did you try changing these settings when recompiling (you should be able to set them in the CUDA Makefile)?
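A minimal sketch of how those two flags can be derived from a GPU's compute capability, in case it helps. `arch_flags` is a hypothetical helper (it is not part of TRENTOOL), and the compute capability has to be looked up for the card in question, for example in NVIDIA's list of CUDA GPUs.

```shell
# arch_flags: hypothetical helper, not part of TRENTOOL.
# Turns a compute capability such as "5.2" into the nvcc flags
# mentioned above (-arch=compute_xx and -code=sm_xx).
arch_flags() {
    # Drop the dot: "5.2" becomes "52".
    cc=$(printf '%s' "$1" | tr -d '.')
    printf '%s\n' "-arch=compute_${cc} -code=sm_${cc}"
}
```

For example, `arch_flags 5.2` prints `-arch=compute_52 -code=sm_52`, which could then be copied into the nvcc flags in the CUDA Makefile.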
Thanks. I was able to find improvements from passing the compute flags. I realized that I am getting errors from running make in the cuda directory. Unfortunately, I am getting a reference error:
/usr/bin/g++ -c -DMX_COMPAT_32 -D_GNU_SOURCE -DMATLAB_MEX_FILE -I"/usr/local/MATLAB/R2016a/extern/include" -I"/usr/local/MATLAB/R2016a/simulink/include" -ansi -fexceptions -fPIC -fno-omit-frame-pointer -pthread -std=c++11 -O -DNDEBUG /home/dpdarrow/matlab/TRENTOOL3/cuda/fnearneigh_gpu.cpp -o /tmp/mex_252588407813488_30845/fnearneigh_gpu.o
/usr/bin/g++ -pthread -Wl,--no-undefined -shared -O -Wl,--version-script,"/usr/local/MATLAB/R2016a/extern/lib/glnxa64/mexFunction.map" /tmp/mex_252588407813488_30845/fnearneigh_gpu.o -lgpuKnnLibrary -lcudart -lcusparse -lcublas -L. -L/usr/local/cuda/lib64 -Wl,-rpath-link,/usr/local/MATLAB/R2016a/bin/glnxa64 -L"/usr/local/MATLAB/R2016a/bin/glnxa64" -lmx -lmex -lmat -lm -lstdc++ -o fnearneigh_gpu.mexa64
/tmp/mex_252588407813488_30845/fnearneigh_gpu.o: In function `mexFunction':
fnearneigh_gpu.cpp:(.text+0x249): undefined reference to `cudaFindKnn(int*, float*, float*, float*, int, int, int, int, int)'
collect2: error: ld returned 1 exit status
I was able to adjust the compute code to 52, but this error arises either way. Thank you for your help! I will continue to attempt other solutions.
I saw the other post by @samuelandjw and found his linking fix of removing extern "C" from gpuKnnLibrary.cu, which seemed to solve the compilation issue.
Unfortunately, the GPU run has been segfaulting MATLAB, so I am hoping to track that down.
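For anyone hitting the same undefined reference: the linker message shows that the MEX object expects `cudaFindKnn` with C++ linkage (it lists the parameter types), so the library has to export the C++-mangled name rather than a plain C symbol. A rough way to check which one a file exports is to look at its `nm` output. The sketch below assumes the standard Itanium C++ mangling used by g++ (where the name starts with `_Z11cudaFindKnn`); `linkage_of` is a hypothetical helper name.

```shell
# linkage_of: hypothetical helper. Reads `nm` output on stdin and reports
# how the symbol cudaFindKnn appears in it:
#   "c++"     the mangled name (_Z11cudaFindKnn...) is present
#   "c"       the plain, unmangled name is present (extern "C" linkage)
#   "missing" neither form was found
linkage_of() {
    awk '
        $NF ~ /^_Z11cudaFindKnn/ { print "c++"; found = 1; exit }
        $NF == "cudaFindKnn"     { print "c";   found = 1; exit }
        END { if (!found) print "missing" }
    '
}
```

Usage would be something like `nm libgpuKnnLibrary.a | linkage_of`, where the library file name is an assumption based on the `-lgpuKnnLibrary` flag in the link command. If it reports `c` while the MEX object references the mangled name, the two will not link.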
I finally determined that the major issue was kernel interruption from the OS on a nondedicated GPU last year. I upgraded to a new computer, and unfortunately, after fixing everything that had been a problem in the past, I am now receiving the same error.
TEfindDelay.m - line 45: Estimating TE for u = 10 ms
Error using fnearneigh_gpu
Error detected in the GPU (check console)
Error in TEcallGPUsearch (line 313)
[~, distance_p21] = fnearneigh_gpu(single(pointset_p21),single(pointset_p21),k_th,TheilerT,nchunks);
Error in TEsurrogatestats_ensemble (line 880)
[ncount] = TEcallGPUsearch(cfg,channelpair,pointsets_concat_1,pointsets_concat_2, ...
Error in TEfindDelay (line 57)
TGA_results{uu}=TEsurrogatestats_ensemble(cfgTESS,data);
Error in InteractionDelayReconstruction_calculate (line 115)
[dataprep, TEmat] = TEfindDelay(predicttimevec_u,cfgTESS,dataprep);
Console: invalid device function
The current CUDA library version is 8 and the dedicated GPU is a P5000.
Any thoughts or help would be greatly appreciated.
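One thing that may be worth checking here, as a sketch rather than a diagnosis: "invalid device function" can mean that the compiled library contains no GPU code for the card it is running on. As far as I know the P5000 is a Pascal card with compute capability 6.1, so a library built with only `-code=sm_52` would not contain a matching binary. `cuobjdump --list-elf` lists the embedded GPU binaries, and their names include the architecture. `has_sm` below is a hypothetical helper that scans that listing.

```shell
# has_sm: hypothetical helper. Reads the output of `cuobjdump --list-elf`
# on stdin and succeeds (exit status 0) if a binary for the given
# architecture number, e.g. 61 for sm_61, appears in the listing.
has_sm() {
    # Match "sm_<number>" not followed by another digit.
    grep -Eq "sm_$1([^0-9]|\$)"
}
```

Usage would be along the lines of `cuobjdump --list-elf libgpuKnnLibrary.so | has_sm 61 && echo found`, where the library file name is an assumption. If nothing for sm_61 is found, rebuilding with the architecture flags set to 61 might be the next thing to try.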