XanaduAI/jet

GPU code fails to compile for provided CMake examples

Closed · 1 comment

mlxd commented

Bug description

A recent change to CudaTensor.hpp (#50) means GPU support can no longer be compiled without explicitly using nvcc. Switching to nvcc also brings many unwanted side effects because of the level of C++17 support in nvcc. As a result, the cuTENSOR-enabled tests and benchmark examples no longer compile or run using the provided CMake scripts.

  • Expected behavior: The cuTENSOR-enabled code should compile and pass all tests, and all examples should be runnable.

  • Actual behavior: The cuTENSOR-enabled code fails to compile for the given test cases, and all benchmark examples using GPU support fail to compile.

  • Reproduces how often: Always, when building with CMake on a GPU-enabled system with -DENABLE_CUTENSOR=1.

  • System information: Linux, AMD64, nvcc 11.3, cuTENSOR 1.3.1, g++ 10.2.0/10.3.0/11.1.0, GTX 1060
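
As a rough illustration of the nvcc dependency described above: kernel-launch syntax and built-ins such as blockIdx are CUDA extensions, so a header that uses them can only be parsed by nvcc. The sketch below shows one common way to keep such code away from a host compiler, by guarding it with the `__CUDACC__` macro that nvcc predefines. The names `scale_kernel` and `device_code_status` are hypothetical and do not come from Jet or Taskflow; this is not a proposed patch.

```cpp
#include <string>

// __CUDACC__ is predefined by nvcc when it compiles CUDA source.
// A plain host compiler such as g++ leaves it undefined, so the
// guarded kernel below is skipped entirely in a host-only build.
#ifdef __CUDACC__
// Hypothetical kernel: only nvcc understands __global__ and blockIdx.
__global__ void scale_kernel(float *data, float factor, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        data[i] *= factor;
    }
}
#endif

// Hypothetical helper that reports which branch was compiled.
inline std::string device_code_status() {
#ifdef __CUDACC__
    return "device code compiled";
#else
    return "host-only build";
#endif
}
```

Compiled with g++, `device_code_status()` returns "host-only build" and the kernel is never seen by the parser. Without a guard of this kind, any translation unit that includes the CUDA headers has to be built by nvcc.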

Source code and backtraces

Build command:

cmake . -BBuild -DENABLE_NATIVE=1 -DENABLE_OPENMP=1 -DENABLE_CUTENSOR=1 -DBUILD_TESTS=1
cmake --build ./Build

Output:

Consolidate compiler generated dependencies of target test_cutensor
[  9%] Building CXX object test/CMakeFiles/test_cutensor.dir/Test_CudaTensor.cpp.o
In file included from /home/mlxd/BugFixes/Ref/jet/Build/_deps/taskflow-src/taskflow/cuda/cuda_capturer.hpp:4,
                 from /home/mlxd/BugFixes/Ref/jet/Build/_deps/taskflow-src/taskflow/cuda/cuda_flow.hpp:4,
                 from /home/mlxd/BugFixes/Ref/jet/Build/_deps/taskflow-src/taskflow/cudaflow.hpp:9,
                 from /home/mlxd/BugFixes/Ref/jet/include/jet/CudaTensor.hpp:22,
                 from /home/mlxd/BugFixes/Ref/jet/test/Test_CudaTensor.cpp:10:
/home/mlxd/BugFixes/Ref/jet/Build/_deps/taskflow-src/taskflow/cuda/cuda_optimizer.hpp: In member function ‘std::vector<tf::cudaNode*> tf::cudaCapturingBase::_toposort(tf::cudaGraph&)’:
/home/mlxd/BugFixes/Ref/jet/Build/_deps/taskflow-src/taskflow/cuda/cuda_optimizer.hpp:56:11: error: unused variable ‘hu’ [-Werror=unused-variable]
   56 |     auto& hu = std::get<cudaNode::Capture>(u->_handle);
      |           ^~
In file included from /home/mlxd/BugFixes/Ref/jet/Build/_deps/taskflow-src/taskflow/cuda/cuda_flow.hpp:4,
                 from /home/mlxd/BugFixes/Ref/jet/Build/_deps/taskflow-src/taskflow/cudaflow.hpp:9,
                 from /home/mlxd/BugFixes/Ref/jet/include/jet/CudaTensor.hpp:22,
                 from /home/mlxd/BugFixes/Ref/jet/test/Test_CudaTensor.cpp:10:
/home/mlxd/BugFixes/Ref/jet/Build/_deps/taskflow-src/taskflow/cuda/cuda_capturer.hpp: In lambda function:
/home/mlxd/BugFixes/Ref/jet/Build/_deps/taskflow-src/taskflow/cuda/cuda_capturer.hpp:686:8: error: expected primary-expression before ‘<’ token
  686 |     f<<<g, b, s, stream>>>(args...);
      |        ^
/home/mlxd/BugFixes/Ref/jet/Build/_deps/taskflow-src/taskflow/cuda/cuda_capturer.hpp:686:26: error: expected primary-expression before ‘>’ token
  686 |     f<<<g, b, s, stream>>>(args...);
      |                          ^
/home/mlxd/BugFixes/Ref/jet/Build/_deps/taskflow-src/taskflow/cuda/cuda_capturer.hpp:686:32: error: expected ‘)’ before ‘...’ token
  686 |     f<<<g, b, s, stream>>>(args...);
      |                           ~    ^~~
      |                                )
/home/mlxd/BugFixes/Ref/jet/Build/_deps/taskflow-src/taskflow/cuda/cuda_capturer.hpp: In lambda function:
/home/mlxd/BugFixes/Ref/jet/Build/_deps/taskflow-src/taskflow/cuda/cuda_capturer.hpp:795:8: error: expected primary-expression before ‘<’ token
  795 |     f<<<g, b, s, stream>>>(args...);
      |        ^
/home/mlxd/BugFixes/Ref/jet/Build/_deps/taskflow-src/taskflow/cuda/cuda_capturer.hpp:795:26: error: expected primary-expression before ‘>’ token
  795 |     f<<<g, b, s, stream>>>(args...);
      |                          ^
/home/mlxd/BugFixes/Ref/jet/Build/_deps/taskflow-src/taskflow/cuda/cuda_capturer.hpp:795:32: error: expected ‘)’ before ‘...’ token
  795 |     f<<<g, b, s, stream>>>(args...);
      |                           ~    ^~~
      |                                )
In file included from /home/mlxd/BugFixes/Ref/jet/Build/_deps/taskflow-src/taskflow/cudaflow.hpp:10,
                 from /home/mlxd/BugFixes/Ref/jet/include/jet/CudaTensor.hpp:22,
                 from /home/mlxd/BugFixes/Ref/jet/test/Test_CudaTensor.cpp:10:
/home/mlxd/BugFixes/Ref/jet/Build/_deps/taskflow-src/taskflow/cuda/cuda_algorithm/cuda_for_each.hpp: In function ‘void tf::cuda_for_each(I, size_t, F)’:
/home/mlxd/BugFixes/Ref/jet/Build/_deps/taskflow-src/taskflow/cuda/cuda_algorithm/cuda_for_each.hpp:25:14: error: ‘blockIdx’ was not declared in this scope
   25 |   size_t i = blockIdx.x*blockDim.x + threadIdx.x;
      |              ^~~~~~~~
/home/mlxd/BugFixes/Ref/jet/Build/_deps/taskflow-src/taskflow/cuda/cuda_algorithm/cuda_for_each.hpp:25:25: error: ‘blockDim’ was not declared in this scope
   25 |   size_t i = blockIdx.x*blockDim.x + threadIdx.x;
      |                         ^~~~~~~~
/home/mlxd/BugFixes/Ref/jet/Build/_deps/taskflow-src/taskflow/cuda/cuda_algorithm/cuda_for_each.hpp:25:38: error: ‘threadIdx’ was not declared in this scope
   25 |   size_t i = blockIdx.x*blockDim.x + threadIdx.x;
      |                                      ^~~~~~~~~
/home/mlxd/BugFixes/Ref/jet/Build/_deps/taskflow-src/taskflow/cuda/cuda_algorithm/cuda_for_each.hpp: In function ‘void tf::cuda_for_each_index(I, I, size_t, F)’:
/home/mlxd/BugFixes/Ref/jet/Build/_deps/taskflow-src/taskflow/cuda/cuda_algorithm/cuda_for_each.hpp:38:14: error: ‘blockIdx’ was not declared in this scope
   38 |   size_t i = blockIdx.x*blockDim.x + threadIdx.x;
      |              ^~~~~~~~
/home/mlxd/BugFixes/Ref/jet/Build/_deps/taskflow-src/taskflow/cuda/cuda_algorithm/cuda_for_each.hpp:38:25: error: ‘blockDim’ was not declared in this scope
   38 |   size_t i = blockIdx.x*blockDim.x + threadIdx.x;
      |                         ^~~~~~~~
/home/mlxd/BugFixes/Ref/jet/Build/_deps/taskflow-src/taskflow/cuda/cuda_algorithm/cuda_for_each.hpp:38:38: error: ‘threadIdx’ was not declared in this scope
   38 |   size_t i = blockIdx.x*blockDim.x + threadIdx.x;
      |                                      ^~~~~~~~~
/home/mlxd/BugFixes/Ref/jet/Build/_deps/taskflow-src/taskflow/cuda/cuda_algorithm/cuda_for_each.hpp: In lambda function:
/home/mlxd/BugFixes/Ref/jet/Build/_deps/taskflow-src/taskflow/cuda/cuda_algorithm/cuda_for_each.hpp:134:26: error: expected primary-expression before ‘<’ token
  134 |     cuda_for_each<I, C><<<(N+B-1)/B, B, 0, stream>>>(first, N, c);
      |                          ^
/home/mlxd/BugFixes/Ref/jet/Build/_deps/taskflow-src/taskflow/cuda/cuda_algorithm/cuda_for_each.hpp:134:52: error: expected primary-expression before ‘>’ token
  134 |     cuda_for_each<I, C><<<(N+B-1)/B, B, 0, stream>>>(first, N, c);
      |                                                    ^
/home/mlxd/BugFixes/Ref/jet/Build/_deps/taskflow-src/taskflow/cuda/cuda_algorithm/cuda_for_each.hpp: In lambda function:
/home/mlxd/BugFixes/Ref/jet/Build/_deps/taskflow-src/taskflow/cuda/cuda_algorithm/cuda_for_each.hpp:151:32: error: expected primary-expression before ‘<’ token
  151 |     cuda_for_each_index<I, C><<<(N+B-1)/B, B, 0, stream>>>(beg, inc, N, c);
      |                                ^
/home/mlxd/BugFixes/Ref/jet/Build/_deps/taskflow-src/taskflow/cuda/cuda_algorithm/cuda_for_each.hpp:151:58: error: expected primary-expression before ‘>’ token
  151 |     cuda_for_each_index<I, C><<<(N+B-1)/B, B, 0, stream>>>(beg, inc, N, c);
      |                                                          ^
/home/mlxd/BugFixes/Ref/jet/Build/_deps/taskflow-src/taskflow/cuda/cuda_algorithm/cuda_for_each.hpp: In lambda function:
/home/mlxd/BugFixes/Ref/jet/Build/_deps/taskflow-src/taskflow/cuda/cuda_algorithm/cuda_for_each.hpp:164:26: error: expected primary-expression before ‘<’ token
  164 |     cuda_for_each<I, C><<<(N+B-1)/B, B, 0, stream>>>(first, N, c);
      |                          ^
/home/mlxd/BugFixes/Ref/jet/Build/_deps/taskflow-src/taskflow/cuda/cuda_algorithm/cuda_for_each.hpp:164:52: error: expected primary-expression before ‘>’ token
  164 |     cuda_for_each<I, C><<<(N+B-1)/B, B, 0, stream>>>(first, N, c);
      |                                                    ^
/home/mlxd/BugFixes/Ref/jet/Build/_deps/taskflow-src/taskflow/cuda/cuda_algorithm/cuda_for_each.hpp: In lambda function:
/home/mlxd/BugFixes/Ref/jet/Build/_deps/taskflow-src/taskflow/cuda/cuda_algorithm/cuda_for_each.hpp:183:32: error: expected primary-expression before ‘<’ token
  183 |     cuda_for_each_index<I, C><<<(N+B-1)/B, B, 0, stream>>>(beg, inc, N, c);
      |                                ^
/home/mlxd/BugFixes/Ref/jet/Build/_deps/taskflow-src/taskflow/cuda/cuda_algorithm/cuda_for_each.hpp:183:58: error: expected primary-expression before ‘>’ token
  183 |     cuda_for_each_index<I, C><<<(N+B-1)/B, B, 0, stream>>>(beg, inc, N, c);
      |                                                          ^
/home/mlxd/BugFixes/Ref/jet/Build/_deps/taskflow-src/taskflow/cuda/cuda_algorithm/cuda_for_each.hpp: In lambda function:
/home/mlxd/BugFixes/Ref/jet/Build/_deps/taskflow-src/taskflow/cuda/cuda_algorithm/cuda_for_each.hpp:191:26: error: expected primary-expression before ‘<’ token
  191 |     cuda_single_task<C><<<1, 1, 0, stream>>>(c);
      |                          ^
/home/mlxd/BugFixes/Ref/jet/Build/_deps/taskflow-src/taskflow/cuda/cuda_algorithm/cuda_for_each.hpp:191:44: error: expected primary-expression before ‘>’ token
  191 |     cuda_single_task<C><<<1, 1, 0, stream>>>(c);
      |                                            ^
/home/mlxd/BugFixes/Ref/jet/Build/_deps/taskflow-src/taskflow/cuda/cuda_algorithm/cuda_for_each.hpp: In lambda function:
/home/mlxd/BugFixes/Ref/jet/Build/_deps/taskflow-src/taskflow/cuda/cuda_algorithm/cuda_for_each.hpp:199:26: error: expected primary-expression before ‘<’ token
  199 |     cuda_single_task<C><<<1, 1, 0, stream>>>(c);
      |                          ^
/home/mlxd/BugFixes/Ref/jet/Build/_deps/taskflow-src/taskflow/cuda/cuda_algorithm/cuda_for_each.hpp:199:44: error: expected primary-expression before ‘>’ token
  199 |     cuda_single_task<C><<<1, 1, 0, stream>>>(c);
      |                                            ^
In file included from /home/mlxd/BugFixes/Ref/jet/Build/_deps/taskflow-src/taskflow/cudaflow.hpp:11,
                 from /home/mlxd/BugFixes/Ref/jet/include/jet/CudaTensor.hpp:22,
                 from /home/mlxd/BugFixes/Ref/jet/test/Test_CudaTensor.cpp:10:
/home/mlxd/BugFixes/Ref/jet/Build/_deps/taskflow-src/taskflow/cuda/cuda_algorithm/cuda_transform.hpp: In function ‘void tf::cuda_transform(I, size_t, F, S ...)’:
/home/mlxd/BugFixes/Ref/jet/Build/_deps/taskflow-src/taskflow/cuda/cuda_algorithm/cuda_transform.hpp:15:14: error: ‘blockIdx’ was not declared in this scope
   15 |   size_t i = blockIdx.x*blockDim.x + threadIdx.x;
      |              ^~~~~~~~
/home/mlxd/BugFixes/Ref/jet/Build/_deps/taskflow-src/taskflow/cuda/cuda_algorithm/cuda_transform.hpp:15:25: error: ‘blockDim’ was not declared in this scope
   15 |   size_t i = blockIdx.x*blockDim.x + threadIdx.x;
      |                         ^~~~~~~~
/home/mlxd/BugFixes/Ref/jet/Build/_deps/taskflow-src/taskflow/cuda/cuda_algorithm/cuda_transform.hpp:15:38: error: ‘threadIdx’ was not declared in this scope
   15 |   size_t i = blockIdx.x*blockDim.x + threadIdx.x;
      |                                      ^~~~~~~~~
/home/mlxd/BugFixes/Ref/jet/Build/_deps/taskflow-src/taskflow/cuda/cuda_algorithm/cuda_transform.hpp: In lambda function:
/home/mlxd/BugFixes/Ref/jet/Build/_deps/taskflow-src/taskflow/cuda/cuda_algorithm/cuda_transform.hpp:70:33: error: expected primary-expression before ‘<’ token
   70 |     cuda_transform<I, C, S...><<<(N+B-1)/B, B, 0, stream>>>(first, N, c, srcs...);
      |                                 ^
/home/mlxd/BugFixes/Ref/jet/Build/_deps/taskflow-src/taskflow/cuda/cuda_algorithm/cuda_transform.hpp:70:59: error: expected primary-expression before ‘>’ token
   70 |     cuda_transform<I, C, S...><<<(N+B-1)/B, B, 0, stream>>>(first, N, c, srcs...);
      |                                                           ^
/home/mlxd/BugFixes/Ref/jet/Build/_deps/taskflow-src/taskflow/cuda/cuda_algorithm/cuda_transform.hpp:70:78: error: expected ‘)’ before ‘...’ token
   70 |     cuda_transform<I, C, S...><<<(N+B-1)/B, B, 0, stream>>>(first, N, c, srcs...);
      |                                                            ~                 ^~~
      |                                                                              )
/home/mlxd/BugFixes/Ref/jet/Build/_deps/taskflow-src/taskflow/cuda/cuda_algorithm/cuda_transform.hpp: In lambda function:
/home/mlxd/BugFixes/Ref/jet/Build/_deps/taskflow-src/taskflow/cuda/cuda_algorithm/cuda_transform.hpp:86:33: error: expected primary-expression before ‘<’ token
   86 |     cuda_transform<I, C, S...><<<(N+B-1)/B, B, 0, stream>>>(first, N, c, srcs...);
      |                                 ^
/home/mlxd/BugFixes/Ref/jet/Build/_deps/taskflow-src/taskflow/cuda/cuda_algorithm/cuda_transform.hpp:86:59: error: expected primary-expression before ‘>’ token
   86 |     cuda_transform<I, C, S...><<<(N+B-1)/B, B, 0, stream>>>(first, N, c, srcs...);
      |                                                           ^
/home/mlxd/BugFixes/Ref/jet/Build/_deps/taskflow-src/taskflow/cuda/cuda_algorithm/cuda_transform.hpp:86:78: error: expected ‘)’ before ‘...’ token
   86 |     cuda_transform<I, C, S...><<<(N+B-1)/B, B, 0, stream>>>(first, N, c, srcs...);
      |                                                            ~                 ^~~
      |                                                                              )
In file included from /home/mlxd/BugFixes/Ref/jet/Build/_deps/taskflow-src/taskflow/cudaflow.hpp:12,
                 from /home/mlxd/BugFixes/Ref/jet/include/jet/CudaTensor.hpp:22,
                 from /home/mlxd/BugFixes/Ref/jet/test/Test_CudaTensor.cpp:10:
/home/mlxd/BugFixes/Ref/jet/Build/_deps/taskflow-src/taskflow/cuda/cuda_algorithm/cuda_reduce.hpp: In function ‘void tf::cuda_reduce(I, size_t, T*, C)’:
/home/mlxd/BugFixes/Ref/jet/Build/_deps/taskflow-src/taskflow/cuda/cuda_algorithm/cuda_reduce.hpp:28:16: error: ‘threadIdx’ was not declared in this scope
   28 |   size_t tid = threadIdx.x;
      |                ^~~~~~~~~
/home/mlxd/BugFixes/Ref/jet/Build/_deps/taskflow-src/taskflow/cuda/cuda_algorithm/cuda_reduce.hpp:36:20: error: ‘blockDim’ was not declared in this scope
   36 |   for(size_t i=tid+blockDim.x; i<N; i+=blockDim.x) {
      |                    ^~~~~~~~
/home/mlxd/BugFixes/Ref/jet/Build/_deps/taskflow-src/taskflow/cuda/cuda_algorithm/cuda_reduce.hpp:40:3: error: there are no arguments to ‘__syncthreads’ that depend on a template parameter, so a declaration of ‘__syncthreads’ must be available [-fpermissive]
   40 |   __syncthreads();
      |   ^~~~~~~~~~~~~
/home/mlxd/BugFixes/Ref/jet/Build/_deps/taskflow-src/taskflow/cuda/cuda_algorithm/cuda_reduce.hpp:40:3: note: (if you use ‘-fpermissive’, G++ will accept your code, but allowing the use of an undeclared name is deprecated)
/home/mlxd/BugFixes/Ref/jet/Build/_deps/taskflow-src/taskflow/cuda/cuda_algorithm/cuda_reduce.hpp:42:18: error: ‘blockDim’ was not declared in this scope
   42 |   for(size_t s = blockDim.x / 2; s > 32; s >>= 1) {
      |                  ^~~~~~~~
/home/mlxd/BugFixes/Ref/jet/Build/_deps/taskflow-src/taskflow/cuda/cuda_algorithm/cuda_reduce.hpp:46:5: error: there are no arguments to ‘__syncthreads’ that depend on a template parameter, so a declaration of ‘__syncthreads’ must be available [-fpermissive]
   46 |     __syncthreads();
      |     ^~~~~~~~~~~~~
/home/mlxd/BugFixes/Ref/jet/Build/_deps/taskflow-src/taskflow/cuda/cuda_algorithm/cuda_reduce.hpp: In lambda function:
/home/mlxd/BugFixes/Ref/jet/Build/_deps/taskflow-src/taskflow/cuda/cuda_algorithm/cuda_reduce.hpp:188:34: error: expected primary-expression before ‘<’ token
  188 |     cuda_reduce<I, T, C, false><<<1, B, B*sizeof(T), stream>>>(
      |                                  ^
/home/mlxd/BugFixes/Ref/jet/Build/_deps/taskflow-src/taskflow/cuda/cuda_algorithm/cuda_reduce.hpp:188:62: error: expected primary-expression before ‘>’ token
  188 |     cuda_reduce<I, T, C, false><<<1, B, B*sizeof(T), stream>>>(
      |                                                              ^
/home/mlxd/BugFixes/Ref/jet/Build/_deps/taskflow-src/taskflow/cuda/cuda_algorithm/cuda_reduce.hpp: In lambda function:
/home/mlxd/BugFixes/Ref/jet/Build/_deps/taskflow-src/taskflow/cuda/cuda_algorithm/cuda_reduce.hpp:206:33: error: expected primary-expression before ‘<’ token
  206 |     cuda_reduce<I, T, C, true><<<1, B, B*sizeof(T), stream>>>(
      |                                 ^
/home/mlxd/BugFixes/Ref/jet/Build/_deps/taskflow-src/taskflow/cuda/cuda_algorithm/cuda_reduce.hpp:206:61: error: expected primary-expression before ‘>’ token
  206 |     cuda_reduce<I, T, C, true><<<1, B, B*sizeof(T), stream>>>(
      |                                                             ^
/home/mlxd/BugFixes/Ref/jet/Build/_deps/taskflow-src/taskflow/cuda/cuda_algorithm/cuda_reduce.hpp: In lambda function:
/home/mlxd/BugFixes/Ref/jet/Build/_deps/taskflow-src/taskflow/cuda/cuda_algorithm/cuda_reduce.hpp:224:34: error: expected primary-expression before ‘<’ token
  224 |     cuda_reduce<I, T, C, false><<<1, B, B*sizeof(T), stream>>>(
      |                                  ^
/home/mlxd/BugFixes/Ref/jet/Build/_deps/taskflow-src/taskflow/cuda/cuda_algorithm/cuda_reduce.hpp:224:62: error: expected primary-expression before ‘>’ token
  224 |     cuda_reduce<I, T, C, false><<<1, B, B*sizeof(T), stream>>>(
      |                                                              ^
/home/mlxd/BugFixes/Ref/jet/Build/_deps/taskflow-src/taskflow/cuda/cuda_algorithm/cuda_reduce.hpp: In lambda function:
/home/mlxd/BugFixes/Ref/jet/Build/_deps/taskflow-src/taskflow/cuda/cuda_algorithm/cuda_reduce.hpp:242:33: error: expected primary-expression before ‘<’ token
  242 |     cuda_reduce<I, T, C, true><<<1, B, B*sizeof(T), stream>>>(
      |                                 ^
/home/mlxd/BugFixes/Ref/jet/Build/_deps/taskflow-src/taskflow/cuda/cuda_algorithm/cuda_reduce.hpp:242:61: error: expected primary-expression before ‘>’ token
  242 |     cuda_reduce<I, T, C, true><<<1, B, B*sizeof(T), stream>>>(
      |                                                             ^
In file included from /home/mlxd/BugFixes/Ref/jet/include/jet/CudaTensor.hpp:11,
                 from /home/mlxd/BugFixes/Ref/jet/test/Test_CudaTensor.cpp:10:
/home/mlxd/BugFixes/Ref/jet/include/jet/CudaTensor.hpp: In destructor ‘Jet::CudaTensor<T, CUDA_DEVICE>::~CudaTensor()’:
/home/mlxd/BugFixes/Ref/jet/include/jet/CudaTensor.hpp:215:50: error: expected primary-expression before ‘ctx’
  215 |         JET_CUDA_IS_SUCCESS(tf::cudaScopedDevice ctx(CUDA_DEVICE);
      |                                                  ^~~
/home/mlxd/BugFixes/Ref/jet/include/jet/Abort.hpp:31:11: note: in definition of macro ‘JET_ABORT_IF_NOT’
   31 |     if (!(expression)) {                                                       \
      |           ^~~~~~~~~~
/home/mlxd/BugFixes/Ref/jet/include/jet/CudaTensor.hpp:215:9: note: in expansion of macro ‘JET_CUDA_IS_SUCCESS’
  215 |         JET_CUDA_IS_SUCCESS(tf::cudaScopedDevice ctx(CUDA_DEVICE);
      |         ^~~~~~~~~~~~~~~~~~~
/home/mlxd/BugFixes/Ref/jet/include/jet/CudaTensor.hpp:215:50: error: expected ‘)’ before ‘ctx’
  215 |         JET_CUDA_IS_SUCCESS(tf::cudaScopedDevice ctx(CUDA_DEVICE);
      |                                                  ^~~
/home/mlxd/BugFixes/Ref/jet/include/jet/Abort.hpp:31:11: note: in definition of macro ‘JET_ABORT_IF_NOT’
   31 |     if (!(expression)) {                                                       \
      |           ^~~~~~~~~~
/home/mlxd/BugFixes/Ref/jet/include/jet/CudaTensor.hpp:215:9: note: in expansion of macro ‘JET_CUDA_IS_SUCCESS’
  215 |         JET_CUDA_IS_SUCCESS(tf::cudaScopedDevice ctx(CUDA_DEVICE);
      |         ^~~~~~~~~~~~~~~~~~~
/home/mlxd/BugFixes/Ref/jet/include/jet/Abort.hpp:31:10: note: to match this ‘(’
   31 |     if (!(expression)) {                                                       \
      |          ^
/home/mlxd/BugFixes/Ref/jet/include/jet/CudaTensorHelpers.hpp:26:5: note: in expansion of macro ‘JET_ABORT_IF_NOT’
   26 |     JET_ABORT_IF_NOT(err == cudaSuccess, cudaGetErrorString(err))
      |     ^~~~~~~~~~~~~~~~
/home/mlxd/BugFixes/Ref/jet/include/jet/CudaTensor.hpp:215:9: note: in expansion of macro ‘JET_CUDA_IS_SUCCESS’
  215 |         JET_CUDA_IS_SUCCESS(tf::cudaScopedDevice ctx(CUDA_DEVICE);
      |         ^~~~~~~~~~~~~~~~~~~
/home/mlxd/BugFixes/Ref/jet/include/jet/CudaTensor.hpp:215:66: error: expected ‘)’ before ‘;’ token
  215 |         JET_CUDA_IS_SUCCESS(tf::cudaScopedDevice ctx(CUDA_DEVICE);
      |                                                                  ^
/home/mlxd/BugFixes/Ref/jet/include/jet/Abort.hpp:31:11: note: in definition of macro ‘JET_ABORT_IF_NOT’
   31 |     if (!(expression)) {                                                       \
      |           ^~~~~~~~~~
/home/mlxd/BugFixes/Ref/jet/include/jet/CudaTensor.hpp:215:9: note: in expansion of macro ‘JET_CUDA_IS_SUCCESS’
  215 |         JET_CUDA_IS_SUCCESS(tf::cudaScopedDevice ctx(CUDA_DEVICE);
      |         ^~~~~~~~~~~~~~~~~~~
/home/mlxd/BugFixes/Ref/jet/include/jet/Abort.hpp:31:8: note: to match this ‘(’
   31 |     if (!(expression)) {                                                       \
      |        ^
/home/mlxd/BugFixes/Ref/jet/include/jet/CudaTensorHelpers.hpp:26:5: note: in expansion of macro ‘JET_ABORT_IF_NOT’
   26 |     JET_ABORT_IF_NOT(err == cudaSuccess, cudaGetErrorString(err))
      |     ^~~~~~~~~~~~~~~~
/home/mlxd/BugFixes/Ref/jet/include/jet/CudaTensor.hpp:215:9: note: in expansion of macro ‘JET_CUDA_IS_SUCCESS’
  215 |         JET_CUDA_IS_SUCCESS(tf::cudaScopedDevice ctx(CUDA_DEVICE);
      |         ^~~~~~~~~~~~~~~~~~~
/home/mlxd/BugFixes/Ref/jet/include/jet/CudaTensor.hpp:215:66: error: suggest braces around empty body in an ‘if’ statement [-Werror=empty-body]
  215 |         JET_CUDA_IS_SUCCESS(tf::cudaScopedDevice ctx(CUDA_DEVICE);
      |                                                                  ^
/home/mlxd/BugFixes/Ref/jet/include/jet/Abort.hpp:31:11: note: in definition of macro ‘JET_ABORT_IF_NOT’
   31 |     if (!(expression)) {                                                       \
      |           ^~~~~~~~~~
/home/mlxd/BugFixes/Ref/jet/include/jet/CudaTensor.hpp:215:9: note: in expansion of macro ‘JET_CUDA_IS_SUCCESS’
  215 |         JET_CUDA_IS_SUCCESS(tf::cudaScopedDevice ctx(CUDA_DEVICE);
      |         ^~~~~~~~~~~~~~~~~~~
/home/mlxd/BugFixes/Ref/jet/include/jet/Abort.hpp:31:21: error: expected ‘;’ before ‘)’ token
   31 |     if (!(expression)) {                                                       \
      |                     ^
/home/mlxd/BugFixes/Ref/jet/include/jet/CudaTensorHelpers.hpp:26:5: note: in expansion of macro ‘JET_ABORT_IF_NOT’
   26 |     JET_ABORT_IF_NOT(err == cudaSuccess, cudaGetErrorString(err))
      |     ^~~~~~~~~~~~~~~~
/home/mlxd/BugFixes/Ref/jet/include/jet/CudaTensor.hpp:215:9: note: in expansion of macro ‘JET_CUDA_IS_SUCCESS’
  215 |         JET_CUDA_IS_SUCCESS(tf::cudaScopedDevice ctx(CUDA_DEVICE);
      |         ^~~~~~~~~~~~~~~~~~~
/home/mlxd/BugFixes/Ref/jet/include/jet/CudaTensor.hpp: In instantiation of ‘Jet::CudaTensor<T, CUDA_DEVICE>::~CudaTensor() [with T = float2; int CUDA_DEVICE = 0]’:
/home/mlxd/BugFixes/Ref/jet/test/Test_CudaTensor.cpp:54:20:   required from here
/home/mlxd/BugFixes/Ref/jet/include/jet/CudaTensorHelpers.hpp:26:26: error: value computed is not used [-Werror=unused-value]
   26 |     JET_ABORT_IF_NOT(err == cudaSuccess, cudaGetErrorString(err))
/home/mlxd/BugFixes/Ref/jet/include/jet/Abort.hpp:31:11: note: in definition of macro ‘JET_ABORT_IF_NOT’
   31 |     if (!(expression)) {                                                       \
      |           ^~~~~~~~~~
/home/mlxd/BugFixes/Ref/jet/include/jet/CudaTensor.hpp:215:9: note: in expansion of macro ‘JET_CUDA_IS_SUCCESS’
  215 |         JET_CUDA_IS_SUCCESS(tf::cudaScopedDevice ctx(CUDA_DEVICE);
      |         ^~~~~~~~~~~~~~~~~~~
make[2]: *** [test/CMakeFiles/test_cutensor.dir/build.make:76: test/CMakeFiles/test_cutensor.dir/Test_CudaTensor.cpp.o] Interrupt
make[1]: *** [CMakeFiles/Makefile2:890: test/CMakeFiles/test_cutensor.dir/all] Interrupt
make: *** [Makefile:146: all] Interrupt
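
The errors reported at CudaTensor.hpp:215 look like a separate problem from the missing nvcc support: a declaration (`tf::cudaScopedDevice ctx(CUDA_DEVICE);`) appears to be passed into JET_CUDA_IS_SUCCESS, which the log shows expanding into an `if (!(expression))` condition, and a declaration is not an expression. The sketch below reproduces the corrected pattern with stand-ins only. `CHECK_SUCCESS`, `ScopedDevice`, `release_buffer` and `release_on_device` are hypothetical names that assume nothing about the real Jet or Taskflow implementations.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a CUDA status code (0 means success).
using Status = int;

// Hypothetical check macro. Like the macro in the log, it evaluates its
// argument as an expression, so a declaration cannot be passed to it.
#define CHECK_SUCCESS(expr)                                             \
    do {                                                                \
        const Status status_ = (expr);                                  \
        if (status_ != 0) {                                             \
            throw std::runtime_error("call failed with status " +       \
                                     std::to_string(status_));          \
        }                                                               \
    } while (0)

// Hypothetical RAII guard standing in for tf::cudaScopedDevice: it
// selects a device on construction and restores the previous one on
// destruction.
class ScopedDevice {
  public:
    explicit ScopedDevice(int device) : previous_(current()) {
        current() = device;
    }
    ~ScopedDevice() { current() = previous_; }
    ScopedDevice(const ScopedDevice &) = delete;
    ScopedDevice &operator=(const ScopedDevice &) = delete;

    // Currently selected (simulated) device id.
    static int &current() {
        static int id = 0;
        return id;
    }

  private:
    int previous_;
};

// Hypothetical stand-in for a status-returning call such as cudaFree.
// It records which device was active when it ran.
inline Status release_buffer(int *freed_on_device) {
    *freed_on_device = ScopedDevice::current();
    return 0;
}

// Corrected pattern: the guard is declared as its own statement, and
// only the status-returning call is wrapped by the check macro.
inline int release_on_device(int device) {
    int freed_on = -1;
    ScopedDevice ctx(device);
    CHECK_SUCCESS(release_buffer(&freed_on));
    return freed_on;
}
```

Here `release_on_device(3)` runs the release while device 3 is selected, and the guard restores the previous device when the function returns.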

Thanks for writing up the issue! I guess this is what happens when you don't have proper CI in place for GPU code.

We can make a 0.2.1 release with the relevant patch when it's ready; until then, the GPU benchmarks should be run using Jet at commit d74502b.