Kernel overall local size
forreg16 opened this issue · 3 comments
Good afternoon.
When I run my Java code on an NVIDIA GeForce RTX 3060 Ti graphics card, I get an error:
Kernel overall local size: 1000 exceeds maximum kernel allowed local size of: 256 failed
Running the same code on an Intel HD Graphics 630 or AMD Radeon R7 450 graphics card works fine.
If I pass a number less than 256 in this part of the code, it also works fine on the NVIDIA GeForce RTX 3060 Ti:
Range range = needDevice.createRange(255);
kernel.execute(range);
The NVIDIA GeForce RTX 3060 Ti is more modern than the Intel HD Graphics 630 or AMD Radeon R7 450, but for some reason the largest value it accepts in createRange is lower than on the older cards.
What could be the problem?
Hey @forreg16, I have the same problem. My card is an RTX 3070 and I run it on Linux.
The problem happens because the max group size is hardcoded to 256:
public static final int MAX_OPENCL_GROUP_SIZE = 256;
I don't know why this is the max; I don't have experience with OpenCL.
I hope the maintainers of the project will answer here.
Maybe our option is to change the value and recompile the library, but I don't know whether there are any instructions for how to do that.
Hey @trayanmomkov.
Try this version of the code. In this case, my size parameter can be set to more than 256.
Range range = needDevice.createRange2D(size, 1);
kernel.execute(range);
You can see more details here:
https://stackoverflow.com/questions/75365328/error-exceeds-maximum-kernel-allowed-local-size
But @forreg16, you can achieve that with create(size, localSize), where localSize <= 256 and size % localSize == 0.
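As a rough illustration of those two constraints, here is a minimal sketch in plain Java. The class LocalSizePicker and both of its methods are hypothetical helpers made up for this example; they are not part of Aparapi. The limit of 256 is taken from the error message above and passed in as a parameter. The first method looks for the largest localSize that divides size exactly. The second covers the case where no useful divisor exists (for example, a prime size) by rounding the global size up to a multiple of a fixed localSize instead.

```java
// Hypothetical helper, not part of Aparapi. It only does the arithmetic for
// the two constraints: localSize <= maxLocalSize and size % localSize == 0.
final class LocalSizePicker {

    /** Returns the largest divisor of globalSize that does not exceed maxLocalSize. */
    static int largestLocalSize(int globalSize, int maxLocalSize) {
        if (globalSize <= 0 || maxLocalSize <= 0) {
            throw new IllegalArgumentException("sizes must be positive");
        }
        // Search downwards from the limit so the first hit is the largest divisor.
        for (int candidate = Math.min(globalSize, maxLocalSize); candidate > 1; candidate--) {
            if (globalSize % candidate == 0) {
                return candidate;
            }
        }
        return 1; // 1 divides every size, so this is always a valid fallback
    }

    /**
     * Returns the smallest multiple of localSize that is >= globalSize.
     * Assumes the result fits in an int (no overflow check).
     */
    static int roundUpToMultiple(int globalSize, int localSize) {
        if (globalSize <= 0 || localSize <= 0) {
            throw new IllegalArgumentException("sizes must be positive");
        }
        return ((globalSize + localSize - 1) / localSize) * localSize;
    }

    public static void main(String[] args) {
        int size = 1000;
        System.out.println(largestLocalSize(size, 256));  // prints 250
        System.out.println(roundUpToMultiple(size, 256)); // prints 1024
    }
}
```

With size = 1000, the first value (250) could then be passed as localSize to create(size, localSize). If the padded global size is used instead, the kernel would presumably need a guard such as if (getGlobalId() < size) so that the extra work items do nothing.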
The real problem is that localSize cannot be greater than 256.
On my card, which has 5888 cores, I want a greater localSize to achieve better performance.
Aparapi actually chooses a localSize of 640 automatically, but when it tries to set it I get the error:
!!!!!!! Kernel overall local size: 640 exceeds maximum kernel allowed local size of: 256 failed (null)