Shark-ML/Remora

OpenCL error


Is this because of the graphics card?

naths@naths-HP-Laptop-15-bs1xx:~/build/remora/bin$ ./Benchmark_GPU_Conv2D
performance float
Flops
35 4 8 32 10391.1
67 4 8 32 12546.4
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
131 4 8 32 64964.4
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5

CNugteren/CLBlast#301
It's as if it is not being resolved. I have tried all the options.

He reports that all the regular CLBlast tests fail. OpenCL error -5 is CL_OUT_OF_RESOURCES, so I suspect he is trying to solve problems that are too large for his GPU memory. Do smaller problems work?
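For anyone reading along, the numeric codes in these messages can be mapped back to the names in the OpenCL headers. The helper below is only a sketch: `describe_cl_error` is a made-up name (it is not part of CLBlast or Remora), and it covers just a handful of the status codes defined in `CL/cl.h`.

```cpp
#include <string>

// Hypothetical helper: translate a few OpenCL status codes into their
// symbolic names. The numeric values are the ones defined in CL/cl.h;
// the list is deliberately incomplete.
std::string describe_cl_error(int code) {
    switch (code) {
        case 0:   return "CL_SUCCESS";
        case -4:  return "CL_MEM_OBJECT_ALLOCATION_FAILURE";
        case -5:  return "CL_OUT_OF_RESOURCES";
        case -6:  return "CL_OUT_OF_HOST_MEMORY";
        case -54: return "CL_INVALID_WORK_GROUP_SIZE";
        default:  return "unknown OpenCL status " + std::to_string(code);
    }
}
```

Calling `describe_cl_error(-5)` gives the name quoted above, which is usually easier to search for than the bare number.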

Yeah, I think I screwed up the memory requirements of the benchmark. Multiplying several small numbers can still produce one very big one :)
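To illustrate how quickly such a product grows, here is a rough sketch that multiplies a list of dimensions in 64-bit arithmetic and converts the result to MiB. The function names are hypothetical, and the batch size of 256 used below is an invented number for illustration, not something taken from the benchmark source.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Multiply all dimensions using 64-bit integers, so the product is far
// less likely to wrap around than it would be with a 32-bit int.
std::uint64_t element_count(const std::vector<std::uint64_t>& dims) {
    std::uint64_t n = 1;
    for (std::uint64_t d : dims) {
        n *= d;
    }
    return n;
}

// Convert an element count to MiB for a given element size in bytes.
double size_in_mib(std::uint64_t elements, std::size_t bytes_per_element) {
    return static_cast<double>(elements) *
           static_cast<double>(bytes_per_element) / (1024.0 * 1024.0);
}
```

For example, `element_count({256, 131, 131, 32})` is 140,582,912 elements, and `size_in_mib` of that with 4-byte floats is about 536 MiB, even though none of the four factors looks large on its own.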

I will fix it once I am back from vacation.

@Ulfgard any fix?

It should be fixed now.

This is what I get with:
a.) Fresh build of CLBlast
b.) Fresh build of Remora

naths@naths-HP-Laptop-15-bs1xx:~/build/remora/bin$ ./Benchmark_GPU_Conv2D
performance float
Flops
35 4 8 32 9193.43
67 4 8 32 11677.4
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
131 4 8 32 62958.7
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
^C

And now I also get this error:
[ 60%] Built target LinearRegression
/home/naths/srcs/Remora/examples/Benchmarks/opencl_conv2d.cpp:38:26: error: use of undeclared identifier 'out_gpu'
double storage = double(out_gpu.size1() * out_gpu.size2())/1024/1024;
^
/home/naths/srcs/Remora/examples/Benchmarks/opencl_conv2d.cpp:38:44: error: use of undeclared identifier 'out_gpu'
double storage = double(out_gpu.size1() * out_gpu.size2())/1024/1024;

Are you sure you pulled and rebuilt the opencl_conv2d benchmark? This is the old output from before my changes (I am now also printing the approximate storage requirements of the output).

Now I can't even build it:
naths@naths-HP-Laptop-15-bs1xx:~/build/remora$ make
[ 6%] Building CXX object examples/CMakeFiles/Benchmark_GPU_Conv2D.dir/Benchmarks/opencl_conv2d.cpp.o
/home/naths/srcs/Remora/examples/Benchmarks/opencl_conv2d.cpp:38:26: error: use of undeclared identifier 'out_gpu'
double storage = double(out_gpu.size1() * out_gpu.size2())/1024/1024;
^
/home/naths/srcs/Remora/examples/Benchmarks/opencl_conv2d.cpp:38:44: error: use of undeclared identifier 'out_gpu'
double storage = double(out_gpu.size1() * out_gpu.size2())/1024/1024;
^
2 errors generated.
examples/CMakeFiles/Benchmark_GPU_Conv2D.dir/build.make:62: recipe for target 'examples/CMakeFiles/Benchmark_GPU_Conv2D.dir/Benchmarks/opencl_conv2d.cpp.o' failed
make[2]: *** [examples/CMakeFiles/Benchmark_GPU_Conv2D.dir/Benchmarks/opencl_conv2d.cpp.o] Error 1
CMakeFiles/Makefile2:121: recipe for target 'examples/CMakeFiles/Benchmark_GPU_Conv2D.dir/all' failed
make[1]: *** [examples/CMakeFiles/Benchmark_GPU_Conv2D.dir/all] Error 2
Makefile:129: recipe for target 'all' failed
make: *** [all] Error 2

Sorry about that. Please rename out_gpu to out_opencl; some refactoring tool must have gotten it wrong.

Now is the point where I have to tell you that I will probably move the convolution out of Remora, either into Shark or into a new repository with the other image-processing code we implement.

This is what I get
naths@naths-HP-Laptop-15-bs1xx:~/build/remora/bin$ ./Benchmark_GPU_Conv2D
performance float
im_size filtpx incChan OutChan memOut Flops
19 4 3 16 0.0220337 756.088
35 4 3 16 0.0747681 1200.29
67 4 3 16 0.273987 1178.22
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
131 4 3 16 1.04742 3035.47
19 8 3 16 0.0220337 1504.48
35 8 3 16 0.0747681 2378.29
67 8 3 16 0.273987 2819.58
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -5
131 8 3 16 1.04742 5916.42

I see. That could be the algorithm: we use an explicit matrix approach, which can take quite a lot of memory. Unfortunately, this is not going to change until CLBlast has something better. This is as far as we can go for now.
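As a rough idea of why an explicit matrix costs so much memory, the sketch below estimates the size of an im2col-style patch matrix. It rests on several assumptions that are not confirmed anywhere in this thread: that the explicit matrix resembles im2col, that the convolution is unpadded with stride 1, and that the `filtpx` column is the filter side length. `patch_matrix_shape` is a hypothetical name and not Remora's API.

```cpp
#include <cstddef>

// Shape of an im2col-style patch matrix: one row per output pixel,
// one column per (filter pixel, input channel) pair.
struct PatchMatrixShape {
    std::size_t rows;
    std::size_t cols;
};

// Assumes an unpadded ("valid") convolution with stride 1 and a square
// filter. Returns {0, 0} if the filter does not fit into the image.
PatchMatrixShape patch_matrix_shape(std::size_t height, std::size_t width,
                                    std::size_t in_channels,
                                    std::size_t filter_size) {
    if (filter_size == 0 || filter_size > height || filter_size > width) {
        return {0, 0};
    }
    std::size_t out_h = height - filter_size + 1;
    std::size_t out_w = width - filter_size + 1;
    return {out_h * out_w, filter_size * filter_size * in_channels};
}
```

Under those assumptions, a single 131x131 image with 3 channels and an 8x8 filter yields a 15376 x 192 matrix, which is 2,952,192 elements compared with 51,483 elements in the image itself. Every input pixel is copied into roughly filter_size squared rows, so the buffer grows much faster than the input.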

I am currently working on HIP support, which should allow me to use cuDNN for this on NVIDIA cards (and MIOpen on AMD), but as I said, this is being moved out of Remora.