Compute shader support & basic example
repi opened this issue · 6 comments
We are focusing first on fragment and other graphics shaders, and the bindings for those. But we also want and need to add support for basic compute shaders soon. This issue tracks that.
Features relevant to compute:
- LocalSize
- Buffer inputs / outputs
- Builtins like GlobalInvocationID / LocalInvocationID
- PushConstants
- Shared Workgroup Memory
- Memory Barriers
- Loops
- Math / Approx Ops
- Atomic Ops
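To make the LocalSize and GlobalInvocationID items above concrete, here is a minimal CPU-side sketch in plain Rust of how a one-dimensional dispatch assigns IDs. It is not rust-gpu code: `Invocation`, `dispatch`, and `double_all` are hypothetical names invented for illustration, and the loop runs sequentially, whereas a GPU runs invocations in parallel.

```rust
/// Hypothetical stand-in for the builtins a compute invocation sees (1D only).
#[derive(Clone, Copy, Debug, PartialEq)]
struct Invocation {
    workgroup_id: u32,
    local_invocation_id: u32,
    global_invocation_id: u32,
}

/// Sequentially emulates a 1D dispatch of `num_workgroups` groups,
/// each containing `local_size` invocations.
fn dispatch(num_workgroups: u32, local_size: u32, mut kernel: impl FnMut(Invocation)) {
    for workgroup_id in 0..num_workgroups {
        for local_invocation_id in 0..local_size {
            kernel(Invocation {
                workgroup_id,
                local_invocation_id,
                // Global ID = workgroup ID * workgroup size + local ID.
                global_invocation_id: workgroup_id * local_size + local_invocation_id,
            });
        }
    }
}

/// Example "kernel": reads an input buffer and writes doubled values to an output buffer.
fn double_all(input: &[f32], local_size: u32) -> Vec<f32> {
    let mut output = vec![0.0; input.len()];
    // Round up so every element is covered by some invocation.
    let num_workgroups = (input.len() as u32 + local_size - 1) / local_size;
    dispatch(num_workgroups, local_size, |inv| {
        let i = inv.global_invocation_id as usize;
        // The last workgroup may overshoot the buffer, so guard the access.
        if i < input.len() {
            output[i] = input[i] * 2.0;
        }
    });
    output
}
```

Calling `double_all(&[1.0, 2.0, 3.0, 4.0, 5.0], 4)` emulates two workgroups of four invocations; the last three invocations fall outside the buffer and are skipped by the bounds check.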
With #195, you can compile and execute a compute shader, but it can't do anything.
Are the math operations you want to support the ones found here? https://www.khronos.org/registry/OpenCL/specs/2.2/html/OpenCL_C.html#math-functions
I don't know which math functions SPIR-V actually supports underneath, but I have a private repo I abandoned that has approximations of a lot of them: ei, e1, erf, erfc, erfcinv, and erfinv. I just un-privated it so you can see it, but I wrote it very early on when I started using Rust, and I'm "newish" to Rust development in general, so I probably didn't follow best practices. What it does have is approximations for several of the main "special functions" (except lgamma and tgamma), with sources for where I got the implementations. Ironically, the original purpose was to demo and benchmark, in Rust, possible approximations for OpenCL functions that are missing from GLSL, but I found a way around needing the special function I wanted to use.
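For readers unfamiliar with this kind of approximation, here is a minimal sketch of one well-known option for erf: the Abramowitz and Stegun formula 7.1.26, which has an absolute error of roughly 1.5e-7 in exact arithmetic. It is not taken from the repo mentioned above; the function names are hypothetical, and it only relies on `exp`, which shader instruction sets generally provide.

```rust
/// Approximates erf(x) using Abramowitz & Stegun formula 7.1.26.
/// Illustrative only; not the implementation from the repo discussed above.
fn erf_approx(x: f32) -> f32 {
    const P: f32 = 0.3275911;
    const A: [f32; 5] = [
        0.254829592,
        -0.284496736,
        1.421413741,
        -1.453152027,
        1.061405429,
    ];
    // erf is odd, so evaluate on |x| and restore the sign at the end.
    let sign = if x < 0.0 { -1.0 } else { 1.0 };
    let x = x.abs();
    let t = 1.0 / (1.0 + P * x);
    // Horner evaluation of a1*t + a2*t^2 + ... + a5*t^5.
    let poly = ((((A[4] * t + A[3]) * t + A[2]) * t + A[1]) * t + A[0]) * t;
    sign * (1.0 - poly * (-x * x).exp())
}

/// Complementary error function derived from the approximation above.
/// Relative accuracy degrades for large x, where erfc(x) is tiny.
fn erfc_approx(x: f32) -> f32 {
    1.0 - erf_approx(x)
}
```

The fixed absolute error is fine for many graphics uses, but a dedicated erfc approximation would be needed wherever the tail of the function matters.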
Another great resource I found is this:
https://developer.download.nvidia.com/cg/cos.html
or rather, the rest of the math functions listed in its sidebar. Some of those pages show the example approximations used. I'm not sure whether any of the math ops you can't get builtins for are among them.
When I get the time I would be willing to contribute to fixing any of these problems, but I don't really know what needs to be done to implement this, and I don't really understand the compilation process here yet.
Yes, I wasn't sure if those work in shaders generally or just in kernels. OpenCL has FAST_RELAXED_MATH, but I'm not sure if that can be specified somehow.
In particular fma, exp, ln, sqrt, etc.
I would complain to the Khronos Group about the fast-math behavior, though I think it shouldn't matter for most use cases. To explain further: I don't think there's a way to say "don't optimize math" in SPIR-V, and because of this you get odd behavior like this. Essentially, Nvidia is using FAST_RELAXED_MATH behind the scenes after consuming your SPIR-V code, at least in some instances.
This is a long-standing issue that I've complained about forever and listed on Vulkan surveys, and nothing has been done about it.
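As an illustration of why relaxed math can produce surprising results, here is a small CPU-side sketch in Rust of one such transformation: contracting a multiply and an add into a fused multiply-add. It does not reproduce what any particular driver does; the inputs are chosen by hand so that the two forms round differently in `f32`, and the function names are hypothetical.

```rust
/// Multiply, round to f32, then add and round again (two roundings).
fn unfused(a: f32, b: f32, c: f32) -> f32 {
    a * b + c
}

/// Fused multiply-add: a * b + c with a single rounding at the end.
fn fused(a: f32, b: f32, c: f32) -> f32 {
    a.mul_add(b, c)
}

/// Returns (unfused, fused) results for inputs where the two differ.
fn contraction_demo() -> (f32, f32) {
    // a = 1 + 2^-12, so a * a = 1 + 2^-11 + 2^-24 exactly.
    // The 2^-24 term is lost when the product is rounded to f32.
    let a = 1.0_f32 + 1.0 / 4096.0;
    // c = -(1 + 2^-11) cancels everything except that lost term.
    let c = -(1.0_f32 + 1.0 / 2048.0);
    (unfused(a, a, c), fused(a, a, c))
}
```

Here the unfused form yields exactly zero while the fused form keeps the 2^-24 residue, so code that depends on either behavior can change results if a compiler is free to pick between them.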
In particular fma, exp, ln, sqrt, etc.
Shouldn't you be able to get access to fma, exp, ln, sqrt, etc. through the graphics extended instruction set, with the exception of things like cbrt and other special functions? That is how I understand it from here:
https://www.khronos.org/registry/spir-v/specs/unified1/GLSL.std.450.html#_introduction
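For a function like cbrt that the extended instruction set does not provide, one option is to build it from operations that are available. The sketch below is a CPU-side illustration in plain Rust under that assumption: it uses only `exp`, `ln`, and basic arithmetic, followed by one Newton step to tighten the result. The name `cbrt_approx` is hypothetical, and the code is not tuned for inputs near `f32::MAX`, where the intermediate cube can overflow.

```rust
/// Approximates the cube root using exp/ln plus one Newton refinement step.
/// Illustrative sketch only; extreme magnitudes are not specially handled.
fn cbrt_approx(x: f32) -> f32 {
    // Zero, infinities, and NaN pass through unchanged.
    if x == 0.0 || !x.is_finite() {
        return x;
    }
    let m = x.abs();
    // Initial estimate: m^(1/3) = exp(ln(m) / 3).
    let mut y = (m.ln() / 3.0).exp();
    // One Newton iteration on f(y) = y^3 - m.
    y -= (y * y * y - m) / (3.0 * y * y);
    // The cube root is odd, so restore the sign of the input.
    y.copysign(x)
}
```

Working on the magnitude and restoring the sign afterwards avoids taking the logarithm of a negative number.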