koide3/fast_gicp

cudaErrorIllegalAddress: an illegal memory access was encountered

felixmr1 opened this issue · 1 comment

After building the focal_cuda Docker container and running the tests inside it, I get this error:

[==========] Running 9 tests from 2 test suites.
[----------] Global test environment set-up.
[----------] 1 test from GICPTestBase
[ RUN      ] GICPTestBase.LoadCheck
[       OK ] GICPTestBase.LoadCheck (9 ms)
[----------] 1 test from GICPTestBase (9 ms total)

[----------] 8 tests from AlignmentTest2/AlignmentTest
[ RUN      ] AlignmentTest2/AlignmentTest.test/GICP_ST
[       OK ] AlignmentTest2/AlignmentTest.test/GICP_ST (317 ms)
[ RUN      ] AlignmentTest2/AlignmentTest.test/GICP_MT
[       OK ] AlignmentTest2/AlignmentTest.test/GICP_MT (117 ms)
[ RUN      ] AlignmentTest2/AlignmentTest.test/VGICP_ST
[       OK ] AlignmentTest2/AlignmentTest.test/VGICP_ST (211 ms)
[ RUN      ] AlignmentTest2/AlignmentTest.test/VGICP_MT
[       OK ] AlignmentTest2/AlignmentTest.test/VGICP_MT (93 ms)
[ RUN      ] AlignmentTest2/AlignmentTest.test/VGICP_CUDA_ST
/root/fast_gicp/src/test/gicp_test.cpp:164: Failure
Expected: (errors[0]) < (t_tol), actual: 0.497354 vs 0.05
FORWARD TEST
/root/fast_gicp/src/test/gicp_test.cpp:174: Failure
Expected: (errors[0]) < (t_tol), actual: 0.497354 vs 0.05
BACKWARD TEST
/root/fast_gicp/src/test/gicp_test.cpp:186: Failure
Expected: (errors[0]) < (t_tol), actual: 0.497354 vs 0.05
SWAP AND SET SOURCE TEST
/root/fast_gicp/src/test/gicp_test.cpp:198: Failure
Expected: (errors[0]) < (t_tol), actual: 0.497354 vs 0.05
SWAP AND SET TARGET TEST
[  FAILED  ] AlignmentTest2/AlignmentTest.test/VGICP_CUDA_ST, where GetParam() = (0x55a2afbf1209 pointing to "VGICP_CUDA", false) (5579 ms)
[ RUN      ] AlignmentTest2/AlignmentTest.test/VGICP_CUDA_MT
/root/fast_gicp/src/test/gicp_test.cpp:164: Failure
Expected: (errors[0]) < (t_tol), actual: 0.497354 vs 0.05
FORWARD TEST
/root/fast_gicp/src/test/gicp_test.cpp:174: Failure
Expected: (errors[0]) < (t_tol), actual: 0.497354 vs 0.05
BACKWARD TEST
/root/fast_gicp/src/test/gicp_test.cpp:186: Failure
Expected: (errors[0]) < (t_tol), actual: 0.497354 vs 0.05
SWAP AND SET SOURCE TEST
/root/fast_gicp/src/test/gicp_test.cpp:198: Failure
Expected: (errors[0]) < (t_tol), actual: 0.497354 vs 0.05
SWAP AND SET TARGET TEST
[  FAILED  ] AlignmentTest2/AlignmentTest.test/VGICP_CUDA_MT, where GetParam() = (0x55a2afbf1209 pointing to "VGICP_CUDA", true) (69 ms)
[ RUN      ] AlignmentTest2/AlignmentTest.test/NDT_CUDA_ST
terminate called after throwing an instance of 'thrust::system::system_error'
  what():  parallel_for failed: cudaErrorIllegalAddress: an illegal memory access was encountered
Aborted (core dumped)

Has anyone else run into the same error?

Output of nvidia-smi on the host computer:

Thu Oct 26 15:35:02 2023
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.113.01             Driver Version: 535.113.01   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 3050 ...    Off | 00000000:01:00.0 Off |                  N/A |
| N/A   50C    P8               9W /  40W |    603MiB /  4096MiB |      2%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A      3379      G   /usr/lib/xorg/Xorg                          373MiB |
|    0   N/A  N/A      3508      G   /usr/bin/gnome-shell                         93MiB |
|    0   N/A  N/A    475224      G   x-terminal-emulator                           8MiB |
|    0   N/A  N/A    479320      G   x-terminal-emulator                           8MiB |
|    0   N/A  N/A    488622    C+G   ...8017866,16359961026231572468,262144       14MiB |
|    0   N/A  N/A    717283      G   x-terminal-emulator                           8MiB |
|    0   N/A  N/A    725490      G   x-terminal-emulator                           8MiB |
|    0   N/A  N/A    737069      G   x-terminal-emulator                           8MiB |
+---------------------------------------------------------------------------------------+

My OS is Pop!_OS 22.04, which is essentially Ubuntu 22.04.

Should be fixed by: #137