BoolkaEngine crash when using unaligned vertex struct in mesh shader
Opened this issue · 7 comments
Checklist [README]
- Device is not undervolted nor overclocked
- Device is using the latest drivers
- Application is not cracked or modded and uses the latest patch
Application [Required]
BoolkaEngine
Processor / Processor Number [Required]
AMD Ryzen 5 3600 6-Core Processor
Graphic Card [Required]
Intel(R) Arc(TM) A770 Graphics
GPU Driver Version [Required]
31.0.101.5382
Other GPU Driver version
No response
Rendering API [Required]
- Vulkan
- OpenGL
- DirectX12
- DirectX11
- DirectX10
- DirectX9
- Not applicable
Windows Build Number [Required]
- Windows 11 23H2
- Windows 11 22H2
- Windows 11 21H2
- Windows 10 22H2
- Windows 10 21H2
- Other (Please specify)
Other Windows build number
No response
Intel System Support Utility report
Description and steps to reproduce [Required]
Download latest release of BoolkaEngine - https://github.com/Devaniti/BoolkaEngine/releases/tag/v0.2
Extract all files
Run start.bat
Application will start and immediately crash
BoolkaEngine is my D3D12 engine pet project.
It works just fine on Nvidia/AMD GPUs.
The issue seems to be related to the execution of Mesh Shaders.
There is a commit with a workaround for this crash - Devaniti/BoolkaEngine@02855c2
That workaround changes the layout of the Vertex struct used with Mesh Shaders.
Since there are no relevant restrictions on the layout of the vertex struct, this is most likely not UB inside BoolkaEngine but rather the Intel driver mishandling the vertex struct layout in mesh shaders.
Device / Platform
No response
Crash dumps [Required, if applicable]
No response
Application / Windows logs
No response
@Devaniti hiii and welcome!
I provide support for Game/App developers and I will be assisting you in this case
Let me confirm this crash and I'll be back with my findings. If I have questions I'll ping you right back :)
Karen
Heey @Devaniti quick update!
I could verify the correct execution of the scene using the build on my NVIDIA RTX 3050, but unfortunately it crashes on my ARC with driver v.5382. I have also performed a small regression like you suggested and the behavior is the same, so I'll be creating an internal report for this.
A couple questions for my report:
- Is there an official dx12 doc that you followed to find that the matrix should be returned the way you originally did it? If so, please share
- Can you share how many users (give or take) might be impacted?
Thanks, looking forward to hearing from you :)
Karen
- The most relevant document is this one - https://microsoft.github.io/DirectX-Specs/d3d/MeshShader.html#vertex-attributes
  There is no limit on the structure of the Vertex Attributes. The only requirements are to specify a semantic for each field of the structure and to have a 4-component element with the SV_Position semantic. Both are fulfilled before and after the workaround.
- This is more of a code sample than an application that people would actively use, so the number of users impacted by the crash in the app itself is about 0. However, people may use this code, which crashes on Intel ARC, as a reference in other projects. Either way, I'd expect quite a small number of people to be affected by this bug.
As for the workaround having the same behavior: on my end, the workaround does fix the crash on Intel ARC.
To ensure that we are running the same code in each case, you can build both versions from scratch:
- Clone the https://github.com/Devaniti/BoolkaEngine repo
- Run HelperScripts/QuickStart.bat on the main branch to observe the crash
- Run HelperScripts/QuickStart.bat on the IntelArcWorkaround branch to observe it working with the workaround
That script will build the project, download and prepare the scene, and run the app.
Ty @Devaniti
I have been able to run both branches, but I'd rather focus on the one without the WA and see what we can do on the driver side.
Edit: doing some research. Will update soon
Karen
@Devaniti Just a quick comment:
Looking at the document you shared:
This function requires a 4-component vector, and looking at the WA code you did just that: changed int to int4, and float2 and float3 to float4, right?
So it seems to me that the WA is the correct way to use this function.
I assume that the Nvidia/AMD drivers are converting those non-4-component vectors into valid 4-component vectors.
Do you agree?
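For reference, the type changes described above can be sketched as the two structs below. This is only an illustration: the struct names, field names, and semantics are hypothetical and are not taken from the BoolkaEngine source. Only the type changes (int to int4, float2 and float3 to float4) come from this thread.

```hlsl
// Hypothetical vertex layout before the workaround: mixed-width attributes.
// Field names and semantics are assumptions made for illustration only.
struct VertexBefore
{
    float4 position : SV_Position;             // the required float4 SV_Position
    float3 normal   : NORMAL;                  // 3-component attribute
    float2 uv       : TEXCOORD0;               // 2-component attribute
    nointerpolation int materialID : MATERIAL; // scalar integer attribute
};

// Hypothetical layout after the workaround: every attribute widened to
// four components, as described in the comment above.
struct VertexAfter
{
    float4 position : SV_Position;
    float4 normal   : NORMAL;                   // float3 -> float4, .w unused
    float4 uv       : TEXCOORD0;                // float2 -> float4, .zw unused
    nointerpolation int4 materialID : MATERIAL; // int -> int4, .yzw unused
};
```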
In Vertex Attributes, you are required to have one 4-component attribute with the SV_Position semantic. If the relevant attribute is not a 4-component vector, shader compilation fails with `error : SV_Position must be float4`.
As you can see, both before and after the workaround there is `float4 position : SV_Position;`, which satisfies that requirement, and the shaders build successfully in both cases.
And since the highlighted requirement does not limit other attributes, it is valid for other attributes to have other sizes.
This official mesh shader code sample uses non-4-component attributes as well:
- https://github.com/microsoft/DirectX-Graphics-Samples/blob/master/Samples/Desktop/D3D12MeshShaders/src/MeshletCull/MeshletCommon.hlsli
- https://github.com/microsoft/DirectX-Graphics-Samples/blob/master/Samples/Desktop/D3D12MeshShaders/src/MeshletCull/MeshletMS.hlsl
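To make the point concrete, here is a minimal sketch of a mesh shader whose vertex struct mixes attribute widths while still meeting the SV_Position requirement. It is not code from BoolkaEngine or from the Microsoft sample: the struct, its field names, the semantics, and the hard-coded triangle are all assumptions chosen for illustration. It is intended to be compiled as a mesh shader (for example, target profile ms_6_5).

```hlsl
// Hypothetical vertex attributes with mixed widths (names are illustrative).
// Only the SV_Position attribute has to be a float4.
struct Vertex
{
    float4 position : SV_Position;
    float3 normal   : NORMAL;
    float2 uv       : TEXCOORD0;
    nointerpolation int materialID : MATERIAL;
};

// Corners of one hard-coded triangle, already in clip space.
static const float2 kCorners[3] =
{
    float2(-0.5, -0.5),
    float2( 0.0,  0.5),
    float2( 0.5, -0.5)
};

[numthreads(3, 1, 1)]
[outputtopology("triangle")]
void main(uint tid : SV_GroupThreadID,
          out vertices Vertex verts[3],
          out indices uint3 tris[1])
{
    // Declare the output sizes before writing any vertex or primitive.
    SetMeshOutputCounts(3, 1);

    // Each of the three threads writes one vertex.
    Vertex v;
    v.position   = float4(kCorners[tid], 0.0, 1.0);
    v.normal     = float3(0.0, 0.0, 1.0);
    v.uv         = kCorners[tid] + 0.5;
    v.materialID = 0;
    verts[tid] = v;

    // A single thread writes the one triangle's indices.
    if (tid == 0)
    {
        tris[0] = uint3(0, 1, 2);
    }
}
```

If a shader of this shape compiles and runs on other vendors' drivers but crashes on one, that points at how the driver handles the non-4-component attributes rather than at the shader itself, which is the argument made above.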