Using a smaller block size

Question

pclucas14 opened this issue 2 years ago · 0 comments

Hi,

First of all, thanks for setting up this package :) It's super helpful.

I'm wondering, is there a way to use a smaller block size? I tried modifying the Python code so that it no longer raises errors for smaller sizes, but I'm now hitting the following error when calling the CUDA kernel:

RuntimeError: CUDA error: an illegal memory access was encountered

I looked into the kernel code a bit, and it seems that the block_size argument is not used. So I'm curious how the kernel knows to expect a minimum block size of 32.
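
To make the suspicion concrete, here is a minimal sketch of the kind of mismatch that could produce an illegal memory access. It assumes (unconfirmed) that the kernel hardcodes a 32x32 tile instead of reading block_size. The names `KERNEL_TILE`, `allocated_elements`, `max_index_touched`, and `is_out_of_bounds` are hypothetical and do not come from the package.

```python
# Hypothetical illustration only: this is NOT the package's kernel code.
# Assumption: the kernel indexes memory as if every block were 32x32,
# regardless of the block_size used on the Python side.
KERNEL_TILE = 32


def allocated_elements(n_blocks: int, block_size: int) -> int:
    """Elements in a buffer that stores n_blocks square blocks."""
    return n_blocks * block_size * block_size


def max_index_touched(n_blocks: int, tile: int = KERNEL_TILE) -> int:
    """Largest flat index read by a kernel that assumes tile x tile blocks."""
    return n_blocks * tile * tile - 1


def is_out_of_bounds(n_blocks: int, block_size: int) -> bool:
    """True if the assumed kernel would read past the allocated buffer."""
    return max_index_touched(n_blocks) >= allocated_elements(n_blocks, block_size)


if __name__ == "__main__":
    # Buffer built with block_size=16, kernel still assuming 32.
    print(is_out_of_bounds(n_blocks=4, block_size=16))
    # Buffer built with the size the kernel is assumed to expect.
    print(is_out_of_bounds(n_blocks=4, block_size=32))
```

If this assumption holds, changing only the Python-side checks would leave the buffer four times smaller than what the kernel indexes into for block_size=16, which would be consistent with the error above.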

Any clarifications would be super helpful!

Thanks