gfx-rs/wgpu-rs

Vulkan validation errors on wgpu-rs 0.7

AndiHofi opened this issue · 14 comments

Hi there,

I am writing an application in wgpu-rs and I am currently trying to move from 0.6 to 0.7.

Code migration was relatively straightforward, but when trying to draw anything, the Vulkan validation layer complains a lot.

When trying to create multiple render passes per frame, the application crashes.

I do not know if this is a bug or if I am just making a really stupid mistake.

These are the Vulkan messages. Unfortunately, I don't know how to control those bits on the framebuffer and render pass:

[2021-02-13T15:30:46Z ERROR gfx_backend_vulkan] 
    VALIDATION [VUID-VkRenderPassBeginInfo-framebuffer-03209 (0)] : VkRenderPassBeginInfo: Image view #0 created from an image with flags set as 0x400, but image info #0 used to create the framebuffer had flags set as 0x0 The Vulkan spec states: If framebuffer was created with a VkFramebufferCreateInfo::flags value that included VK_FRAMEBUFFER_CREATE_IMAGELESS_BIT, each element of the pAttachments member of a VkRenderPassAttachmentBeginInfo structure included in the pNext chain must be a VkImageView of an image created with a value of VkImageCreateInfo::flags equal to the flags member of the corresponding element of VkFramebufferAttachmentsCreateInfoKHR::pAttachments used to create framebuffer (https://www.khronos.org/registry/vulkan/specs/1.1-extensions/html/vkspec.html#VUID-VkRenderPassBeginInfo-framebuffer-03209)
    object info: (type: RENDER_PASS, hndl: 103354093011038)
    
[2021-02-13T15:30:46Z ERROR gfx_backend_vulkan] 
    VALIDATION [VUID-VkPresentInfoKHR-pImageIndices-01296 (0)] : Images passed to present must be in layout VK_IMAGE_LAYOUT_PRESENT_SRC_KHR or VK_IMAGE_LAYOUT_SHARED_PRESENT_KHR but is in VK_IMAGE_LAYOUT_UNDEFINED. The Vulkan spec states: Each element of pImageIndices must be the index of a presentable image acquired from the swapchain specified by the corresponding element of the pSwapchains array, and the presented image subresource must be in the VK_IMAGE_LAYOUT_PRESENT_SRC_KHR layout at the time the operation is executed on a VkDevice (https://github.com/KhronosGroup/Vulkan-Docs/search?q=VUID-VkPresentInfoKHR-pImageIndices-01296)
    object info: (type: QUEUE, hndl: 94404108469536)
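
For reading the first message: the `0x400` and `0x0` values are `VkImageCreateFlags` bitmasks. The following is a minimal sketch that maps such a mask back to flag names, assuming the bit values from the Vulkan headers (only the lowest twelve bits are listed). The helper name `decode_image_create_flags` is hypothetical and not part of wgpu or gfx-hal, and the sketch only decodes the number; it does not explain why the two masks differ.

```rust
// Bit values as defined for VkImageCreateFlagBits in the Vulkan headers.
// Only the lowest twelve bits are listed; later extension bits are left out.
const IMAGE_CREATE_FLAGS: &[(u32, &str)] = &[
    (0x001, "VK_IMAGE_CREATE_SPARSE_BINDING_BIT"),
    (0x002, "VK_IMAGE_CREATE_SPARSE_RESIDENCY_BIT"),
    (0x004, "VK_IMAGE_CREATE_SPARSE_ALIASED_BIT"),
    (0x008, "VK_IMAGE_CREATE_MUTABLE_FORMAT_BIT"),
    (0x010, "VK_IMAGE_CREATE_CUBE_COMPATIBLE_BIT"),
    (0x020, "VK_IMAGE_CREATE_2D_ARRAY_COMPATIBLE_BIT"),
    (0x040, "VK_IMAGE_CREATE_SPLIT_INSTANCE_BIND_REGIONS_BIT"),
    (0x080, "VK_IMAGE_CREATE_BLOCK_TEXEL_VIEW_COMPATIBLE_BIT"),
    (0x100, "VK_IMAGE_CREATE_EXTENDED_USAGE_BIT"),
    (0x200, "VK_IMAGE_CREATE_DISJOINT_BIT"),
    (0x400, "VK_IMAGE_CREATE_ALIAS_BIT"),
    (0x800, "VK_IMAGE_CREATE_PROTECTED_BIT"),
];

/// Hypothetical helper: returns the names of all known bits set in `flags`.
fn decode_image_create_flags(flags: u32) -> Vec<&'static str> {
    IMAGE_CREATE_FLAGS
        .iter()
        .filter(|(bit, _)| (flags & bit) != 0)
        .map(|(_, name)| *name)
        .collect()
}

fn main() {
    // The image behind the view, as reported by the validation layer.
    println!("{:?}", decode_image_create_flags(0x400));
    // The image info the framebuffer was created with.
    println!("{:?}", decode_image_create_flags(0x0));
}
```

Under these assumptions, `0x400` decodes to `VK_IMAGE_CREATE_ALIAS_BIT`, while the framebuffer's image info carries no create flags at all.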

Using Rust 1.48.0 on Pop!_OS 20.04 LTS.

The hopefully relevant Cargo dependencies are:

winit = { version = "0.24", features = ["serde"] }

wgpu = "0.7.*"
wgpu-core = "0.7.*"
wgpu-types = "0.7.*"

imgui = "0.7.*"
imgui-wgpu = "0.14.*"
imgui-winit-support = "0.7.*"

I do create the swap chain and everything before starting the winit event loop.

I tried changing all the flags that wgpu-rs exposes on the SwapChainDescriptor and on the RenderPass, but the error messages do not change at all.

The crash happens whenever a second RenderPass is created per frame, even when the CommandBuffer is not submitted to the queue.

How my SwapChain is configured:
(original setup was somewhat faithfully copied from the awesome https://sotrh.github.io/learn-wgpu/ guide)

let sc_desc = wgpu::SwapChainDescriptor {
    usage: wgpu::TextureUsage::RENDER_ATTACHMENT,
    format: wgpu::TextureFormat::Bgra8UnormSrgb,
    width: size.width,
    height: size.height,
    present_mode: wgpu::PresentMode::Fifo,
};
let swap_chain = device.create_swap_chain(surface, &sc_desc);

Rendering:

match state.swap_chain.get_current_frame() {
    Ok(frame) => {
        let attachment = &frame.output.view;
        let mut encoder = render_globals.device.create_command_encoder(
            &wgpu::CommandEncoderDescriptor {
                label: Some("Encoder"),
            },
        );

        let render_pass = encoder.begin_render_pass(&wgpu::RenderPassDescriptor {
            label: None,
            color_attachments: &[
                wgpu::RenderPassColorAttachmentDescriptor {
                    attachment,
                    resolve_target: None,
                    ops: wgpu::Operations {
                        load: wgpu::LoadOp::Clear(wgpu_types::Color::WHITE),
                        store: true,
                    },
                },
            ],
            depth_stencil_attachment: None,
        });
        
        // it does not really matter what I do with the render_pass... errors stay the same
        // So imgui may not be relevant
        
        std::mem::drop(render_pass);

        let overlay_render = encoder.finish();
    
        queue.submit(vec![overlay_render]);
    }
    _ => {
        // omitting error handling for brevity - never reaches this anyways
    }
};

kvark commented

The first validation error is a bug in VVL - KhronosGroup/Vulkan-ValidationLayers#2502
The second one is unexpected. Given that your code is so straightforward, it doesn't look much different from our examples. Could you run the examples? Do they result in this validation error as well for you?

The examples fail as well.

The cube example runs, but shows the same two validation errors for each frame.
The water and texture-arrays examples crash immediately:

 ~/d/wgpu-rs | v0.7 >>  cargo run --example texture-arrays    Sa 13 Feb 2021 19:27:46
    Finished dev [unoptimized + debuginfo] target(s) in 0.06s
     Running `target/debug/examples/texture-arrays`
Using AMD RADV NAVI10 (ACO) (Vulkan)
fish: “cargo run --example texture-arr…” terminated by signal SIGSEGV (Address boundary error)

Neither RUST_BACKTRACE=1 nor RUST_LOG=trace gives much more info. The output ends directly after:

[0.173698 INFO](Device::create_swap_chain)(wgpu_core::device): creating swap chain SwapChainDescriptor { usage: RENDER_ATTACHMENT, format: Bgra8UnormSrgb, width: 800, height: 600, present_mode: Mailbox }
[0.176212 TRACE](CommandEncoder::run_render_pass)(wgpu_core::command::render): Encoding render pass begin in command buffer (0, 1, Vulkan)

The examples on branch 0.6 work fine.

kvark commented

I'll double check this, thank you for the info!
How about master branch? Does stuff run there any differently?

I'll check on master. I created some core dumps and backtraces in the meantime. I had never done such a thing before, so it took a while :)

Actually, the water example works when I turn validation off.

0.7 texture-arrays without validation (release build)

(gdb) bt
#0  0x00007f8801372105 in ?? () from /usr/lib/x86_64-linux-gnu/libvulkan_radeon.so
#1  0x00007f88011b78fd in ?? () from /usr/lib/x86_64-linux-gnu/libvulkan_radeon.so
#2  0x00007f88011add94 in ?? () from /usr/lib/x86_64-linux-gnu/libvulkan_radeon.so
#3  0x00007f88011b1662 in ?? () from /usr/lib/x86_64-linux-gnu/libvulkan_radeon.so
#4  0x00007f88011b40c4 in ?? () from /usr/lib/x86_64-linux-gnu/libvulkan_radeon.so
#5  0x00007f88011b41c6 in ?? () from /usr/lib/x86_64-linux-gnu/libvulkan_radeon.so
#6  0x000055ff5832a4e1 in gfx_backend_vulkan::device::<impl gfx_hal::device::Device<gfx_backend_vulkan::Backend> for gfx_backend_vulkan::Device>::create_graphics_pipeline ()
#7  0x000055ff582d2f75 in wgpu_core::device::Device<B>::create_render_pipeline ()
#8  0x000055ff581b9771 in wgpu_core::device::<impl wgpu_core::hub::Global<G>>::device_create_render_pipeline ()
#9  0x000055ff5814314a in <wgpu::backend::direct::Context as wgpu::Context>::device_create_render_pipeline ()
#10 0x000055ff5816fbcb in wgpu::Device::create_render_pipeline ()
#11 0x000055ff58033206 in <texture_arrays::Example as texture_arrays::framework::Example>::init ()
#12 0x000055ff5808f035 in texture_arrays::framework::start ()
#13 0x000055ff5808f2aa in texture_arrays::framework::run ()
#14 0x000055ff58033b02 in texture_arrays::main ()
#15 0x000055ff580916c3 in std::sys_common::backtrace::__rust_begin_short_backtrace ()
#16 0x000055ff580916d9 in std::rt::lang_start::{{closure}} ()
#17 0x000055ff585fc497 in core::ops::function::impls::<impl core::ops::function::FnOnce<A> for &F>::call_once () at /rustc/7eac88abb2e57e752f3302f02be5f3ce3d7adfb4/library/core/src/ops/function.rs:259
#18 std::panicking::try::do_call () at library/std/src/panicking.rs:381
#19 std::panicking::try () at library/std/src/panicking.rs:345
#20 std::panic::catch_unwind () at library/std/src/panic.rs:382
#21 std::rt::lang_start_internal () at library/std/src/rt.rs:51
#22 0x000055ff58033b32 in main ()

Same backtrace on master:

(gdb) bt
#0  0x00007fe024f85105 in ?? () from /usr/lib/x86_64-linux-gnu/libvulkan_radeon.so
#1  0x00007fe024dca8fd in ?? () from /usr/lib/x86_64-linux-gnu/libvulkan_radeon.so
#2  0x00007fe024dc0d94 in ?? () from /usr/lib/x86_64-linux-gnu/libvulkan_radeon.so
#3  0x00007fe024dc4662 in ?? () from /usr/lib/x86_64-linux-gnu/libvulkan_radeon.so
#4  0x00007fe024dc70c4 in ?? () from /usr/lib/x86_64-linux-gnu/libvulkan_radeon.so
#5  0x00007fe024dc71c6 in ?? () from /usr/lib/x86_64-linux-gnu/libvulkan_radeon.so
#6  0x000055cc61ea62a1 in gfx_backend_vulkan::device::<impl gfx_hal::device::Device<gfx_backend_vulkan::Backend> for gfx_backend_vulkan::Device>::create_graphics_pipeline ()
#7  0x000055cc61da1a8c in wgpu_core::device::Device<B>::create_render_pipeline ()
#8  0x000055cc61e3e6db in wgpu_core::device::<impl wgpu_core::hub::Global<G>>::device_create_render_pipeline ()
#9  0x000055cc61dba4aa in <wgpu::backend::direct::Context as wgpu::Context>::device_create_render_pipeline ()
#10 0x000055cc61caac2b in wgpu::Device::create_render_pipeline ()
#11 0x000055cc61bde400 in <texture_arrays::Example as texture_arrays::framework::Example>::init ()
#12 0x000055cc61c0e335 in texture_arrays::framework::start ()
#13 0x000055cc61c0e5aa in texture_arrays::framework::run ()
#14 0x000055cc61bded02 in texture_arrays::main ()
#15 0x000055cc61c109a3 in std::sys_common::backtrace::__rust_begin_short_backtrace ()
#16 0x000055cc61c109b9 in std::rt::lang_start::{{closure}} ()
#17 0x000055cc6217ff07 in core::ops::function::impls::<impl core::ops::function::FnOnce<A> for &F>::call_once () at /rustc/7eac88abb2e57e752f3302f02be5f3ce3d7adfb4/library/core/src/ops/function.rs:259
#18 std::panicking::try::do_call () at library/std/src/panicking.rs:381
#19 std::panicking::try () at library/std/src/panicking.rs:345
#20 std::panic::catch_unwind () at library/std/src/panic.rs:382
#21 std::rt::lang_start_internal () at library/std/src/rt.rs:51
#22 0x000055cc61bded32 in main ()

I will try to run the same on a different machine.

Different machine (also Pop!_OS) with Intel graphics, same error on master.

kvark commented

@cwfitzgerald would you mind having a look at the texture-array crash?
I'll make sure the water example passes validation (surprised that it doesn't!).

I don't have a machine that I would be able to reproduce this on available to me, unfortunately.

I'm running into this as well, but I'm only seeing the validation error in Renderdoc (and then Renderdoc becomes unusable).

48 API High Miscellaneous 1158829633 Validation Error: [ VUID-VkRenderPassBeginInfo-framebuffer-04627 ] 
 Object 0: handle = Render Pass 258, type = VK_OBJECT_TYPE_RENDER_PASS; | MessageID = 0x45125641 | 
 vkCmdBeginRenderPass(): Image view #1 created from an image with usage set as 0x27, but image info #1
 used to create the framebuffer had usage set as 0x26 The Vulkan spec states: If framebuffer was created 
 with a VkFramebufferCreateInfo::flags value that included VK_FRAMEBUFFER_CREATE_IMAGELESS_BIT, 
 each element of the pAttachments member of a VkRenderPassAttachmentBeginInfo structure included 
 in the pNext chain must be a VkImageView with an inherited usage equal to the usage member of the 
 corresponding element of VkFramebufferAttachmentsCreateInfo::pAttachmentImageInfos used to create 
 framebuffer (https://vulkan.lunarg.com/doc/view/1.2.170.0/windows/1.2-extensions/vkspec.html#VUID-VkRenderPassBeginInfo-framebuffer-04627)
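
To see which usage bit the message above is complaining about, the two masks can be compared bit by bit. This is a minimal sketch assuming the core `VkImageUsageFlagBits` values from the Vulkan headers; the helper name `usage_difference` is hypothetical and not part of any library mentioned in this thread.

```rust
// Core VkImageUsageFlagBits values from the Vulkan headers
// (extension bits above 0x80 are left out of this sketch).
const IMAGE_USAGE_FLAGS: &[(u32, &str)] = &[
    (0x01, "VK_IMAGE_USAGE_TRANSFER_SRC_BIT"),
    (0x02, "VK_IMAGE_USAGE_TRANSFER_DST_BIT"),
    (0x04, "VK_IMAGE_USAGE_SAMPLED_BIT"),
    (0x08, "VK_IMAGE_USAGE_STORAGE_BIT"),
    (0x10, "VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT"),
    (0x20, "VK_IMAGE_USAGE_DEPTH_STENCIL_ATTACHMENT_BIT"),
    (0x40, "VK_IMAGE_USAGE_TRANSIENT_ATTACHMENT_BIT"),
    (0x80, "VK_IMAGE_USAGE_INPUT_ATTACHMENT_BIT"),
];

/// Hypothetical helper: names of the known bits that are set in exactly
/// one of the two usage masks.
fn usage_difference(view_usage: u32, framebuffer_usage: u32) -> Vec<&'static str> {
    // XOR keeps only the bits on which the two masks disagree.
    let differing = view_usage ^ framebuffer_usage;
    IMAGE_USAGE_FLAGS
        .iter()
        .filter(|(bit, _)| (differing & bit) != 0)
        .map(|(_, name)| *name)
        .collect()
}

fn main() {
    // 0x27: usage of the image behind the view; 0x26: usage in the framebuffer's image info.
    println!("{:?}", usage_difference(0x27, 0x26));
}
```

With the values from the log, the only differing bit is `VK_IMAGE_USAGE_TRANSFER_SRC_BIT`. That would be consistent with a capture layer adding a usage bit to the image but not to the framebuffer's attachment info, though this thread does not confirm that reading.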

After opening a capture and attempting to inspect draw calls (e.g. look at the mesh data), Renderdoc will show all zeros for the vertex data, show no data, or crash. When this happens, all future attempts to open a capture fail with Replaying the capture failed at the API level until Renderdoc is restarted. In some cases VK_ERROR_DEVICE_LOST will show up in the diagnostic log, but not always.

In all cases, the validation error pasted above is repeated in the diagnostic logs. I'm on Windows 10 with the latest version of Renderdoc (1.13) and the latest Vulkan SDK (1.2.170.0). It also failed with SDK version 1.2.135.0.

I ran into this while trying to inspect the bevy examples. It was also reported here: bevyengine/bevy#1813 and baldurk/renderdoc#2231

At first glance this does seem to be an issue with Renderdoc (and it may very well be); however, I'm able to run other Vulkan code with the validation layers enabled, and Renderdoc has no problem with it.

kvark commented

Thank you for reporting this to RenderDoc! If something runs validation-free by itself but triggers validation errors in RenderDoc, it means that the intercepting layer of RenderDoc did not do a good job, and it's at fault.

sotrh commented

Someone using my tutorial is having similar issues. sotrh/learn-wgpu#165

Try the nightly version of Renderdoc.

ecton commented

Someone using my tutorial is having similar issues. sotrh/learn-wgpu#165

I am having the issue linked here -- if I try to run cargo run --example texture-arrays on master or the v0.8 tag, it prints the error:

cargo run --example texture-arrays
    Finished dev [unoptimized + debuginfo] target(s) in 0.05s
     Running `target/debug/examples/texture-arrays`
Using GeForce RTX 2070 (Vulkan)
[2021-05-14T18:35:33Z ERROR wgpu_core::validation] Unexpected varying type: Array { base: [1], size: Constant([5]), stride: 4 }
wgpu error: Validation Error

Caused by:
    In Device::create_shader_module
      note: label = `unsized-non-uniform.frag.spv`
    Failed to parse WGSL

My main project, however, is running fine but segfaulting during shutdown/window close. I was trying to track it down and narrow it down when I realized I can't run the wgpu texture-arrays example because of the validation error. This made me question what's unique about my machine.

I'm running Manjaro Linux with a GeForce 2070. I feel like my segfault on shutdown is a separate issue, so I don't want to post too much about it here (I will likely post a follow-up issue once I dig in a bit more), but I wanted to add a "me too" to this particular issue and volunteer to help diagnose it further.

kvark commented

@ecton this example isn't going to validate properly in the near future, we just need to disable validation in it.
Please file an issue about your shutdown crash separately!

Closing due to wgpu-rs -> wgpu transition and because this is targeting 0.7. Please refile an issue on the wgpu repo if this is still a problem on master.