Rust-GPU/rust-gpu

How to use Int8/Int16 capability support

Closed this issue · 9 comments

Question/Bug Report

I am trying to create a compute shader using rust-gpu and wgpu crates. Currently just testing whether I can output true/false from the GPU. I have run into problems utilizing u8/u16 types for output.

The full code, with wgpu running natively on an Nvidia GPU, can be found at 149segolte/wgpu-dsp. This is the shader I am trying to compile:

#![cfg_attr(target_arch = "spirv", no_std)]
// HACK(eddyb) can't easily see warnings otherwise from `spirv-builder` builds.
// #![deny(warnings)]

use glam::UVec3;
use spirv_std::{glam, spirv};

#[allow(dead_code)]
pub struct Uniforms {
    max_seeds: u32,
    chunk: u32,
}

pub fn compute(_seed: u32) -> bool {
    return false;
}

// LocalSize/numthreads of (x = 4, y = 4, z = 2)
#[spirv(compute(threads(4, 4, 2)))]
pub fn compute_shader(
    #[spirv(num_workgroups)] num_workgroups: UVec3,
    #[spirv(workgroup_id)] workgroup_id: UVec3,
    #[spirv(local_invocation_index)] local_invocation_index: u32,
    #[spirv(uniform, descriptor_set = 0, binding = 0)] uniforms: &Uniforms,
    #[spirv(storage_buffer, descriptor_set = 0, binding = 1)] output: &mut [u8],
) {
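    // Flatten the 3D workgroup id into a linear index
    // (x is the slowest-varying axis, then z, then y).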
    let work_group_index = workgroup_id.x * num_workgroups.y * num_workgroups.z
        + workgroup_id.z * num_workgroups.y
        + workgroup_id.y;
    let local_index = local_invocation_index;
    let global_index = work_group_index * 32 + local_index;
    let seed = global_index * (uniforms.chunk + 1);

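    // Skip invocations beyond the requested number of seeds.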
    if global_index >= uniforms.max_seeds {
        return;
    }

    if compute(seed) {
        output[global_index as usize] = 1;
    } else {
        output[global_index as usize] = 0;
    }
}

To use a u8 slice as the output buffer, I enabled spirv_builder::Capability::Int8 in the build script before running the code. With max_seeds = 100_000_000, when the compute function is set to return false, the program outputs:

Valid: 0, Invalid: 100000000

But changing compute to return true gives:

Valid: 2080768, Invalid: 97919232

Expected output is:

Valid: 100000000, Invalid: 0

When run with validation layers, the code runs and gives the same output but also shows the following errors:

[2025-04-02T07:13:54Z ERROR wgpu_hal::vulkan::instance] VALIDATION [VUID-RuntimeSpirv-storageBuffer8BitAccess-06328 (0xbbd18a7e)]
    	vkCreateShaderModule(): SPIR-V contains an 8-bit OpVariable with StorageBuffer Storage Class, but storageBuffer8BitAccess was not enabled.
    %8 = OpVariable %24 12
    The Vulkan spec states: If storageBuffer8BitAccess is VK_FALSE, then objects containing an 8-bit integer element must not have Storage Class of StorageBuffer, ShaderRecordBufferKHR, or PhysicalStorageBuffer (https://vulkan.lunarg.com/doc/view/1.4.309.0/linux/antora/spec/latest/appendices/spirvenv.html#VUID-RuntimeSpirv-storageBuffer8BitAccess-06328)
[2025-04-02T07:13:54Z ERROR wgpu_hal::vulkan::instance] VALIDATION [VUID-VkShaderModuleCreateInfo-pCode-08740 (0x6e224e9)]
    	vkCreateShaderModule(): SPIR-V Capability Int8 was declared, but one of the following requirements is required (VkPhysicalDeviceVulkan12Features::shaderInt8).
    The Vulkan spec states: If pCode is a pointer to SPIR-V code, and pCode declares any of the capabilities listed in the SPIR-V Environment appendix, one of the corresponding requirements must be satisfied (https://vulkan.lunarg.com/doc/view/1.4.309.0/linux/antora/spec/latest/chapters/shaders.html#VUID-VkShaderModuleCreateInfo-pCode-08740)
[2025-04-02T07:13:54Z ERROR wgpu_hal::vulkan::instance] VALIDATION [VUID-VkShaderModuleCreateInfo-pCode-08740 (0x6e224e9)]
    	vkCreateShaderModule(): SPIR-V Capability VulkanMemoryModel was declared, but one of the following requirements is required (VkPhysicalDeviceVulkan12Features::vulkanMemoryModel).
    The Vulkan spec states: If pCode is a pointer to SPIR-V code, and pCode declares any of the capabilities listed in the SPIR-V Environment appendix, one of the corresponding requirements must be satisfied (https://vulkan.lunarg.com/doc/view/1.4.309.0/linux/antora/spec/latest/chapters/shaders.html#VUID-VkShaderModuleCreateInfo-pCode-08740)

Similarly, with a u16 output buffer element type and spirv_builder::Capability::Int16, the false case gives:

Valid: 0, Invalid: 100000000

And the true case gives:

Valid: 4161536, Invalid: 95838464

The validation layers give the following errors:

[2025-04-02T07:42:59Z ERROR wgpu_hal::vulkan::instance] VALIDATION [VUID-RuntimeSpirv-storageBuffer16BitAccess-06331 (0x2df1cd5b)]
    	vkCreateShaderModule(): SPIR-V contains an 16-bit OpVariable with StorageBuffer Storage Class, but storageBuffer16BitAccess was not enabled.
    %8 = OpVariable %24 12
    The Vulkan spec states: If storageBuffer16BitAccess is VK_FALSE, then objects containing 16-bit integer or 16-bit floating-point elements must not have Storage Class of StorageBuffer, ShaderRecordBufferKHR, or PhysicalStorageBuffer (https://vulkan.lunarg.com/doc/view/1.4.309.0/linux/antora/spec/latest/appendices/spirvenv.html#VUID-RuntimeSpirv-storageBuffer16BitAccess-06331)
[2025-04-02T07:42:59Z ERROR wgpu_hal::vulkan::instance] VALIDATION [VUID-VkShaderModuleCreateInfo-pCode-08740 (0x6e224e9)]
    	vkCreateShaderModule(): SPIR-V Capability VulkanMemoryModel was declared, but one of the following requirements is required (VkPhysicalDeviceVulkan12Features::vulkanMemoryModel).
    The Vulkan spec states: If pCode is a pointer to SPIR-V code, and pCode declares any of the capabilities listed in the SPIR-V Environment appendix, one of the corresponding requirements must be satisfied (https://vulkan.lunarg.com/doc/view/1.4.309.0/linux/antora/spec/latest/chapters/shaders.html#VUID-VkShaderModuleCreateInfo-pCode-08740)

System Info

  • Rust: 1.84.0-nightly (b19329a37 2024-11-21)

  • OS: Fedora 41 in a VM with GPU passthrough

  • GPU: NVIDIA GeForce RTX 2070 with Max-Q Design

  • SPIR-V: v2025.1 v2025.1.rc1-0-gf289d047

  • Adapter Info:

AdapterInfo {
    name: "NVIDIA GeForce RTX 2070 with Max-Q Design",
    vendor: 4318,
    device: 8016,
    device_type: DiscreteGpu,
    driver: "NVIDIA",
    driver_info: "565.77",
    backend: Vulkan,
}

Note:

I did not directly use u32 (which would not require any extra configuration) because it ran into maximum buffer size limits, and I do not yet understand GPU programming well enough to copy the output buffer in chunks. Also, u32 seems too big just to pass booleans across memory. I am looking into spirv_std::arch::atomic_or and spirv_std::memory::{Scope, Semantics} together with u8 to reduce memory usage further.
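If that bit-packing works out, the host side would only need a popcount over the mapped bytes. A minimal sketch (count_valid is a hypothetical helper, not part of the current code):

// Sketch: tally results from a bit-packed output buffer, one flag per bit.
fn count_valid(bytes: &[u8], max_seeds: u64) -> (u64, u64) {
    let mut valid: u64 = 0;
    for (i, byte) in bytes.iter().enumerate() {
        // Ignore padding bits past max_seeds in the last byte.
        let bits_used = max_seeds.saturating_sub(i as u64 * 8).min(8) as u32;
        if bits_used == 0 {
            break;
        }
        let mask: u8 = if bits_used == 8 { 0xFF } else { (1u8 << bits_used) - 1 };
        valid += u64::from((byte & mask).count_ones());
    }
    (valid, max_seeds - valid)
}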

Thank you for trying the project and the extremely detailed report! I really appreciate the time you put into it 🍻.

I'll look more deeply at your project later as I am not at a computer right now, but at first glance it looks like StorageBufferInt8 / StorageBufferInt16 might need to be enabled as well?

Yeah, I have looked into it. I am confused about the difference between declaring a spirv_builder::Capability and enabling a wgpu::Features flag.

For Int8 I have tried the different combinations of the following:

SpirvBuilder::new(path, "spirv-unknown-vulkan1.2")
    .capability(spirv_builder::Capability::VulkanMemoryModel)
    .capability(spirv_builder::Capability::Int8)
    .capability(spirv_builder::Capability::StorageBuffer8BitAccess)
    .print_metadata(MetadataPrintout::Full)
    .build()?;

But when defining the adapter and device in wgpu, the docs for wgpu::Features show that it does not expose any of the above three capabilities, so I just use:

&wgpu::DeviceDescriptor {
    label: None,
    required_features: wgpu::Features::SPIRV_SHADER_PASSTHROUGH,
    required_limits: wgpu::Limits::downlevel_defaults(),
    memory_hints: wgpu::MemoryHints::MemoryUsage,
},

For Int16 I use:

SpirvBuilder::new(path, "spirv-unknown-vulkan1.2")
    .capability(spirv_builder::Capability::VulkanMemoryModel)
    .capability(spirv_builder::Capability::Int16)
    .capability(spirv_builder::Capability::StorageBuffer16BitAccess)
    .print_metadata(MetadataPrintout::Full)
    .build()?;

On the wgpu side there is a SHADER_I16 feature but none of the others, so I use:

&wgpu::DeviceDescriptor {
    label: None,
    required_features: wgpu::Features::SPIRV_SHADER_PASSTHROUGH
        | wgpu::Features::SHADER_I16,
    required_limits: wgpu::Limits::downlevel_defaults(),
    memory_hints: wgpu::MemoryHints::MemoryUsage,
},
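As a sanity check, I could also print at runtime which of these wgpu features the adapter reports (a rough sketch, assuming the wgpu::Adapter is bound to a variable named adapter):

// Sketch: inspect what the adapter exposes before requesting the device.
let adapter_features = adapter.features();
println!(
    "SHADER_I16: {}, SHADER_F16: {}",
    adapter_features.contains(wgpu::Features::SHADER_I16),
    adapter_features.contains(wgpu::Features::SHADER_F16),
);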

I don't know how important actually declaring the required features for the GPU is, as I have used the Vulkan Hardware Capability Viewer from the Vulkan SDK to confirm that my GPU does support all of the Int8/Int16 features. If it is important, then this may well be missing on the wgpu side rather than in this library, since wgpu has no feature declaration for storage buffer access of this kind.
But if declaring them is optional, and my device already supports these features, then I am still getting incorrect output using SPIR-V this way:

Valid: 4161536, Invalid: 95838464

I haven't personally used these capabilities, but I know @Firestar99 has... though he might be using Vulkan directly? I'll poke around in a bit too.

Tracing through the code, this is what wgpu enables (from this man page):

shaderInt16 specifies whether 16-bit integers (signed and unsigned) are supported in shader code. If this feature is not enabled, 16-bit integer types must not be used in shader code. This also specifies whether shader modules can declare the Int16 capability. However, this only enables a subset of the storage classes that SPIR-V allows for the Int16 SPIR-V capability: Declaring and using 16-bit integers in the Private, Workgroup (for non-Block variables), and Function storage classes is enabled, while declaring them in the interface storage classes (e.g., UniformConstant, Uniform, StorageBuffer, Input, Output, and PushConstant) is not enabled.

I am not 100% sure yet, but it looks to me like it is a wgpu issue. When you set wgpu::Features::SHADER_I16, wgpu just sets shaderInt16 in Vulkan, which lets you use i16 in your shaders (mainly local variables) but doesn't let you use most storage classes with it. wgpu does not appear to have support for specifying that you want buffer support and plumbing VkPhysicalDevice16BitStorageFeatures through to Vulkan, even though rust-gpu supports it from the GPU code generation side.

So you can either add it to wgpu (likely pretty small and straightforward, adding something like wgpu::Features::STORAGE_16BIT... kinda annoying that it applies to both ints and floats) or use something like ash to drive Vulkan directly (if you don't need web).

FWIW, wgpu is sort of an "obfuscated mode" because there are two layers between you and Vulkan (the WebGPU API and their naga shader translation layer), and then they have things like passthrough to poke through parts. Most folks (including us!) use it because we need web support. But it makes tracking down issues very hard due to the many moving pieces, until you get a handle on how it all fits together. So if you don't need web (or don't need their Metal shader translation on macOS/iOS), it might be easier in the long run to use ash or something like it directly (and it will be a little faster too).
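If you do go the ash route, the missing piece is roughly enabling the 8/16-bit storage feature structs at device creation. An untested sketch (ash ~0.38 style; queue_infos, extension_names, instance, and physical_device are placeholders from your own setup):

// Sketch: chain the 8/16-bit storage feature structs into device creation
// so the driver actually enables storage buffer access for those widths.
let mut storage16 = ash::vk::PhysicalDevice16BitStorageFeatures::default()
    .storage_buffer16_bit_access(true);
let mut storage8 = ash::vk::PhysicalDevice8BitStorageFeatures::default()
    .storage_buffer8_bit_access(true);

let device_info = ash::vk::DeviceCreateInfo::default()
    .queue_create_infos(&queue_infos)           // placeholder
    .enabled_extension_names(&extension_names)  // placeholder
    .push_next(&mut storage16)
    .push_next(&mut storage8);

let device = unsafe { instance.create_device(physical_device, &device_info, None)? };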

Alright, my testing is also leading me to believe that wgpu is the problematic part here. I initially used wgpu because I use a MacBook for local testing and it had naga for that. But I guess I will first try to develop on my Nvidia GPU with ash and, if it doesn't cause any problems, stick with it for now.
@LegNeato Thanks for all the help and the quick diagnostics, really appreciate it. Also, great library: trying to learn GLSL/WGSL has been difficult, and being able to use Rust directly is awesome.

LLMs are pretty good at spitting out ash code, but it might actually be less coding to just patch wgpu if you are up for it. Good luck with your project, and let us know if you hit more issues! 🚀

I am not 100% sure yet, but it looks to me like it is a wgpu issue. When you set wgpu::Features::SHADER_I16, wgpu just sets shaderInt16 in Vulkan, which lets you use i16 in your shaders (mainly local variables) but doesn't let you use most storage classes with it.

Interestingly, if you enable the SHADER_F16 feature you get 16-bit storage buffer access, but you don't if you just enable SHADER_I16:

            shader_float16: if requested_features.contains(wgt::Features::SHADER_F16) {
                Some((
                    vk::PhysicalDeviceShaderFloat16Int8Features::default().shader_float16(true),
                    vk::PhysicalDevice16BitStorageFeatures::default()
                        .storage_buffer16_bit_access(true)
                        .storage_input_output16(true)
                        .uniform_and_storage_buffer16_bit_access(true),
                ))
            } else {
                None
            },

https://github.com/gfx-rs/wgpu/blob/c6286791febc64cf8ef054b5356c2669327ef51c/wgpu-hal/src/vulkan/adapter.rs#L398

@149segolte could you try enabling both the SHADER_F16 and SHADER_I16 features? I tried it locally and it gave me Valid: 4161536, Invalid: 95838464, but at least all of the Vulkan validation errors are gone (apart from VulkanMemoryModel, which you can safely ignore).
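That is, the same descriptor as before, just with both features requested (sketch):

&wgpu::DeviceDescriptor {
    label: None,
    required_features: wgpu::Features::SPIRV_SHADER_PASSTHROUGH
        | wgpu::Features::SHADER_F16
        | wgpu::Features::SHADER_I16,
    required_limits: wgpu::Limits::downlevel_defaults(),
    memory_hints: wgpu::MemoryHints::MemoryUsage,
},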