pal1000/mesa-dist-win

23.1.4 msvc build regression: crashes on Server 2012 R2 and 2016 with 0xc0000005 upon context creation

Optiligence opened this issue · 12 comments

also broken with 23.1.5

no issue with Server 2019

https://ci.appveyor.com/project/knossos/knossos/builds/47779695
unfortunately needs QT_OPENGL_DLL=opengl32 to even pick it up (workaround for qt/qtbase@6c85067#diff-30d85ae3c18697e52d180edc79bb5eefcb74f970957dc73b3a766541088a5237)

Try setting GALLIUM_DRIVER=llvmpipe and then GALLIUM_DRIVER=d3d12. This should show which driver is crashing. Note that GALLIUM_DRIVER=d3d12 cannot work on Server 2012 R2, and even 2016 may be too old.
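The variable is normally set in the shell or the CI configuration before the application starts. For a test harness that needs to pick the driver per process, a minimal sketch could look like the following. The helper name `select_gallium_driver` is hypothetical, and the sketch assumes the driver DLL reads the variable from the same C runtime environment as the caller, and that the call happens before any OpenGL or WGL function is used.

```c
#ifndef _WIN32
#define _POSIX_C_SOURCE 200112L  /* expose setenv() under strict C modes */
#endif
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: sets GALLIUM_DRIVER for the current process.
 * Returns 0 on success, -1 on failure. Must run before the first
 * GL/WGL call, since the driver is chosen at screen creation. */
static int select_gallium_driver(const char *name)
{
    assert(name != NULL);
#ifdef _WIN32
    return _putenv_s("GALLIUM_DRIVER", name) == 0 ? 0 : -1;
#else
    return setenv("GALLIUM_DRIVER", name, 1) == 0 ? 0 : -1;
#endif
}
```

Calling `select_gallium_driver("llvmpipe")` in one run and `select_gallium_driver("d3d12")` in another would then separate the two cases.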

So explicitly selecting a driver works around this issue?

Also, it appears knossos, or rather Qt, is smart enough to handle a context creation failure and exit cleanly; otherwise things would blow up on VS2015/2017 MSVC d3d12 configurations.

We definitely need to report this upstream and inform the main Mesa3D MSVC build developer, @jenatali, as I was able to reproduce it with GPU Caps Viewer (32-bit software) using both 23.1.9 and 23.2.1.

What's the call stack for the crash?

Here is the call stack with 23.1.9; it can be reproduced with 23.2.1 too. GPU Caps Viewer is a 32-bit x86 app. Not every PC reproduces it, so the CPU may matter: I couldn't reproduce it on the Petrosky VPS I use to build Mesa3D, which runs on an AMD EPYC 7413.

>	ntdll.dll!RtlpWaitOnCriticalSection()	Unknown
 	ntdll.dll!RtlpEnterCriticalSectionContended()	Unknown
 	ntdll.dll!_RtlEnterCriticalSection@4()	Unknown
 	libgallium_wgl.dll!mtx_lock(mtx_t * mtx) Line 245	C
 	[Inline Frame] libgallium_wgl.dll!util_queue_kill_threads(util_queue *) Line 503	C
 	libgallium_wgl.dll!util_queue_destroy(util_queue * queue) Line 543	C
 	libgallium_wgl.dll!zink_internal_create_screen(const pipe_screen_config * config) Line 3091	C
 	libgallium_wgl.dll!zink_create_screen(sw_winsys * winsys, const pipe_screen_config * config) Line 3100	C
 	[Inline Frame] libgallium_wgl.dll!wgl_screen_create_by_name(HDC__ * driver, const char *) Line 99	C
 	libgallium_wgl.dll!wgl_screen_create(HDC__ * hDC) Line 143	C
 	[Inline Frame] libgallium_wgl.dll!init_screen(const stw_winsys * stw_winsys, HDC__ *) Line 95	C
 	libgallium_wgl.dll!stw_init_screen(HDC__ * hdc) Line 189	C
 	[Inline Frame] libgallium_wgl.dll!stw_pixelformat_get_extended_count(HDC__ *) Line 365	C
 	libgallium_wgl.dll!stw_pixelformat_choose(HDC__ * hdc, const tagPIXELFORMATDESCRIPTOR * ppfd) Line 515	C
 	opengl32.dll!wglChoosePixelFormat(HDC__ * hdc, const tagPIXELFORMATDESCRIPTOR * ppfd) Line 166	C
 	[External Code]	
 	[Frames below may be incorrect and/or missing, no symbols loaded for GPU_Caps_Viewer.exe]	

@zmike this looks like a Zink bug: if it fails to initialize, it tries to lock an invalid mutex during cleanup.
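To illustrate the failure mode described here, below is a minimal sketch of a cleanup path that stays safe when initialization fails early. It is not the actual Mesa code or the actual fix: `demo_queue`, `demo_screen` and the `loader_ok` flag are hypothetical stand-ins, and a POSIX mutex replaces Mesa's `mtx_t` for portability. The idea is that the destroy function checks whether the mutex was ever initialized before locking it.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for a worker queue that owns a mutex. */
struct demo_queue {
    pthread_mutex_t lock;
    bool initialized;   /* true only after demo_queue_init() succeeded */
    int num_threads;
};

static bool demo_queue_init(struct demo_queue *q, int num_threads)
{
    assert(q != NULL);
    memset(q, 0, sizeof(*q));
    if (pthread_mutex_init(&q->lock, NULL) != 0)
        return false;
    q->num_threads = num_threads;
    q->initialized = true;
    return true;
}

/* Safe to call on a zero-filled queue that was never initialized:
 * the guard keeps us from locking a mutex that does not exist yet. */
static void demo_queue_destroy(struct demo_queue *q)
{
    assert(q != NULL);
    if (!q->initialized)
        return;
    pthread_mutex_lock(&q->lock);
    q->num_threads = 0;
    pthread_mutex_unlock(&q->lock);
    pthread_mutex_destroy(&q->lock);
    q->initialized = false;
}

/* Hypothetical screen object holding one queue. */
struct demo_screen {
    struct demo_queue flush_queue;
};

/* Simulates screen creation that may fail (for example, a missing
 * loader DLL) before the queue has been set up. */
static bool demo_screen_create(struct demo_screen *s, bool loader_ok)
{
    assert(s != NULL);
    memset(s, 0, sizeof(*s));
    if (!loader_ok)
        goto fail;
    if (!demo_queue_init(&s->flush_queue, 1))
        goto fail;
    return true;
fail:
    /* Shared cleanup path: runs even though the queue may be untouched. */
    demo_queue_destroy(&s->flush_queue);
    return false;
}
```

Without the `initialized` guard, the shared `fail` path would lock a mutex that was never set up, which matches the shape of the stack above (`zink_internal_create_screen` calling `util_queue_destroy`, ending in `mtx_lock`).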

zmike commented

🤕

@Optiligence, please test 23.3.0-rc1. It includes the fix by @zmike.

before crashing, it now prints

MESA: error: ZINK: failed to load vulkan-1.dll

23.2.1 does so as well

> before crashing, it now prints
>
> MESA: error: ZINK: failed to load vulkan-1.dll

So the issue is back. You can work around it by installing the Vulkan runtime.

> 23.2.1 does so as well

23.2.1 should crash just like the 23.1 series, since it doesn't include the fix, which means this is a race condition.