SYCL Streams unit test fails on current main branch
mrnorman opened this issue · 6 comments
On current main branch, hash: d29e739
qsub -I -t 30 -n 1 -q florentia_debug
source jlse_gpu_O3.sh
make -j
make test
[ac.normanmr@florentia02:~/YAKL/unit/build/machines/jlse] >:O ./Streams/Streams
Running on Intel(R) Graphics [0x0bd5]
1
YAKL FATAL ERROR:
ERROR: val1 is wrong
terminate called after throwing an instance of 'char const*'
Aborted
Also, if -DYAKL_ENABLE_STREAMS is removed from the flags, we get a segmentation fault, and that needs to be fixed as well.
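Since two different failure modes are reported here (an abort from the uncaught exception, and a segfault without the streams flag), a small helper can make it easier to tell them apart when re-running the binary. This is only a sketch: `describe_exit` is a hypothetical name, not part of YAKL, and the status values assume the usual Linux convention where a process killed by a signal is reported as 128 plus the signal number.

```shell
#!/bin/bash
# Hypothetical helper (not part of YAKL): describe how a test binary ended,
# based on the exit status the shell reports in $?.
# Assumes Linux signal numbers: SIGABRT = 6, SIGSEGV = 11.
describe_exit() {
  local status=$1
  case "$status" in
    0)   echo "passed" ;;
    134) echo "aborted (SIGABRT, 128+6)" ;;
    139) echo "segmentation fault (SIGSEGV, 128+11)" ;;
    *)   echo "failed with exit status $status" ;;
  esac
}
```

For example, running `./Streams/Streams; describe_exit $?` from the build directory would label the run as an abort or a segfault, depending on how the binary was built.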
I did see this error with the default runtime and modules you were loading. Fortunately, the experimental runtime and SDK that I used to test multi-stream support fix this issue. I will put the details here for tracking, and we can close this when the official SDK picks up the fix.
Thanks!
Sorry about the delay. The test works fine with both the default SDK and the experimental SDK, as the logs below show. I am looking into why the stream test segfaults when built without -DYAKL_ENABLE_STREAMS. Hope this helps.
With the latest compiler + drivers on Sunspot (the multi-stream test passes as expected)
sunspot_build_latest_module
#!/bin/bash
module purge
module use /soft/testing/modulefiles/
module load intel-UMD23.05.25593.11/23.05.25593.11
module load dpcpp-master
module load spack cmake
module list
../../cmakeclean.sh
unset GATOR_DISABLE
export CC=`which clang`
export CXX=`which clang++`
export FC=`which gfortran`
unset CXXFLAGS
unset FFLAGS
cmake -DYAKL_ARCH="SYCL" \
-DYAKL_SYCL_FLAGS="-O3 -DYAKL_ENABLE_STREAMS" \
-DCMAKE_CXX_FLAGS="-O3 -fsycl -sycl-std=2020 -fsycl-unnamed-lambda -fsycl-device-code-split=per_kernel -fsycl-targets=spir64_gen -Xsycl-target-backend \"-device 12.60.7\"" \
-DYAKL_F90_FLAGS="-O3" \
-DYAKL_C_FLAGS="-O3" \
../../..
make -j
ctest --no-tests=error
Test log for the above build
Test project /lus/gila/projects/CSC249ADSE15_CNDA/abagusetty/yakl_stream/unit/build/machines/jlse
Start 1: CArray_test
1/17 Test #1: CArray_test ...................... Passed 0.10 sec
Start 2: FArray_test
2/17 Test #2: FArray_test ...................... Passed 0.08 sec
Start 3: Gator_test
3/17 Test #3: Gator_test ....................... Passed 0.10 sec
Start 4: Random_test
4/17 Test #4: Random_test ...................... Passed 0.07 sec
Start 5: FFT_test
5/17 Test #5: FFT_test ......................... Passed 2.24 sec
Start 6: Reductions_test
6/17 Test #6: Reductions_test .................. Passed 0.10 sec
Start 7: Atomics_test
7/17 Test #7: Atomics_test ..................... Passed 0.07 sec
Start 8: Pentadiagonal_test
8/17 Test #8: Pentadiagonal_test ............... Passed 0.01 sec
Start 9: Tridiagonal_test
9/17 Test #9: Tridiagonal_test ................. Passed 0.01 sec
Start 10: Lambda_test
10/17 Test #10: Lambda_test ...................... Passed 0.06 sec
Start 11: Fortran_Link_test
11/17 Test #11: Fortran_Link_test ................Subprocess aborted***Exception: 0.29 sec
Start 12: Fortran_Gator_test
12/17 Test #12: Fortran_Gator_test ............... Passed 0.11 sec
Start 13: OpenMP_Regions_test
13/17 Test #13: OpenMP_Regions_test .............. Passed 0.06 sec
Start 14: Intrinsics_test
14/17 Test #14: Intrinsics_test .................. Passed 0.09 sec
Start 15: ParForC_test
15/17 Test #15: ParForC_test ..................... Passed 0.06 sec
Start 16: ParForFortran_test
16/17 Test #16: ParForFortran_test ............... Passed 0.06 sec
Start 17: Streams_test
17/17 Test #17: Streams_test ..................... Passed 2.94 sec
94% tests passed, 1 tests failed out of 17
Total Test time (real) = 6.50 sec
The following tests FAILED:
11 - Fortran_Link_test (Subprocess aborted)
Errors while running CTest
Output from these tests are in: /lus/gila/projects/CSC249ADSE15_CNDA/abagusetty/yakl_stream/unit/build/machines/jlse/Testing/Temporary/LastTest.log
Use "--rerun-failed --output-on-failure" to re-run the failed cases verbosely.
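To pull just the failing test names out of a saved ctest run like the one above, something along these lines could work. It is a sketch under assumptions: `failed_tests` is a hypothetical helper name, and the log is assumed to contain the "The following tests FAILED:" header followed by lines of the form `11 - Fortran_Link_test (Subprocess aborted)`.

```shell
#!/bin/bash
# Hypothetical helper: print the names of failed tests from saved ctest output.
# Assumes the failed list follows the "The following tests FAILED:" header,
# one "<number> - <name> (<reason>)" entry per line.
failed_tests() {
  awk '
    /^The following tests FAILED:/      { in_list = 1; next }   # start of the list
    in_list && /^[[:space:]]*[0-9]+ - / { print $3; next }      # third field is the test name
    in_list && /^[[:space:]]*$/         { next }                # tolerate blank lines
    in_list                             { in_list = 0 }         # any other line ends the list
  ' "$1"
}
```

With the ctest output saved to a file (for instance `ctest --no-tests=error 2>&1 | tee ctest.log`, where `ctest.log` is an arbitrary file name), `failed_tests ctest.log` would print one failing test name per line.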
Using the defaults (jlse_gpu_O3_AoT_PVC.sh): the multi-stream test fails with the default SDK, which is as expected.
Test log with the default SDK
abagusetty@x1921c0s2b0n0 /lus/gila/projects/CSC249ADSE15_CNDA/abagusetty/yakl_stream/unit/build/machines/jlse (sycl_stream_fortranlink) $ ./Streams/Streams
Running on Intel(R) Graphics [0x0bd6]
3
5
Pool Memory High Water Mark: 1610612736
Pool Memory High Water Efficiency: 0.75
All the above tests
Current main still fails for me on JLSE florentia-debug node using jlse_gpu_O3.sh