google/uVkCompute

Why does a large loop count cause problems on an integrated GPU?

yangfengzzz opened this issue · 0 comments

I have a problem when writing GPGPU code with Vulkan. I don't know where else to ask this question, so I am posting it here in the hope of an answer.

The problem comes from https://github.com/google/uVkCompute/tree/main/benchmarks/compute
I am trying to run the same benchmark in my own Vulkan framework. I found that on an integrated GPU such as an Intel GPU, the result is wrong when kLoopSize is very large. However, when I reduce the operation count (only 4 operations per loop iteration), it works correctly there as well.
The same example works well on discrete GPUs such as AMD and NVIDIA.
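
To make "the result is wrong" concrete, here is a minimal sketch of how the GPU output could be checked against a CPU reference. It assumes the kernel accumulates `z = x * y + z` for `loop_size` iterations per element, which may not match the actual benchmark shader; `ReferenceMad`, `FirstMismatch`, and `rel_tol` are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical CPU reference: runs loop_size iterations of z = x * y + z,
// starting from z = 0. This is an assumed stand-in for the shader loop.
float ReferenceMad(float x, float y, int loop_size) {
  float z = 0.0f;
  for (int i = 0; i < loop_size; ++i) {
    z = x * y + z;
  }
  return z;
}

// Returns the index of the first element whose GPU value differs from the
// CPU reference by more than rel_tol (relative), or -1 if all values match.
long FirstMismatch(const std::vector<float>& gpu,
                   const std::vector<float>& x,
                   const std::vector<float>& y,
                   int loop_size, float rel_tol = 1e-3f) {
  assert(gpu.size() == x.size() && gpu.size() == y.size());
  for (std::size_t i = 0; i < gpu.size(); ++i) {
    const float expected = ReferenceMad(x[i], y[i], loop_size);
    // Scale the tolerance by the expected magnitude, with a floor of 1.
    const float scale = std::fmax(1.0f, std::fabs(expected));
    if (std::fabs(gpu[i] - expected) > rel_tol * scale) {
      return static_cast<long>(i);
    }
  }
  return -1;
}
```

Running a check like this for several values of kLoopSize would show at which loop size, and at which element, the integrated GPU output first diverges.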

uVkCompute works well on both kinds of GPU, so why does this happen in my framework? It is hard to understand. The only difference I found in the pipeline is that I don't recreate the command buffer but reuse it from the command pool. But I don't think that would cause this difference.