LunarG/VulkanTools

vkconfig allows enabling layers it should not control

baldurk opened this issue · 14 comments

Someone reported a crash in RenderDoc, caused by use of vkconfig which somehow force enabled RenderDoc's layer.

Looking at the tool, it seems that it will indeed provide options to force-enable or disable any layer installed on the system, not just the validation layers etc. that it's intended to configure?

Since there's no way for vkconfig to know if an arbitrary layer can safely be enabled without proper configuration, I think vkconfig should have an allow list of only the layers that it will configure, and leave any unknown layers alone. Otherwise users could cause crashes and bugs in other applications, as happened here.

Hi @baldurk !

Vulkan Configurator is not designed only for the SDK layers. For example, @ziga-lunarg would want to debug the shader object layer with RenderDoc and was hoping to order the layers in such a way RenderDoc would capture the calls in the shader object layer.

"Since there's no way for vkconfig to know if an arbitrary layer can safely be enabled without proper configuration". Do you mean the RenderDoc layer here?
What prevents RenderDoc's layer from being enabled and specifically ordered in Vulkan Configurator?

Actually, with Vulkan Configurator 3 and improvements in the Vulkan Loader, we can control the layers' execution order independently from enabling the layers. Enabling the layers was previously required to order them...

So maybe we don't need to have the capability to enable implicit layers. Would ordering the layers in Vulkan Configurator be an issue for RenderDoc?

The important point is that RenderDoc is not a layer, RenderDoc is a debugger which currently uses a layer as an internal implementation detail which should only be used by the tool itself. The layer returns a failure if explicitly enabled at instance creation time and third party programs should not be interfering with RenderDoc internals and changing or activating anything like that. It may cause crashes as I've seen or in future it may not function at all.

I don't know what the vulkan loader implements for layer ordering but at least within vkconfig if you don't want to have an allowlist of working layers I would definitely ask that you add a blocklist of layers to exclude from the UI and include RenderDoc's in there. I'm going to try and also address this from the RenderDoc side so users are aware of problems but it would be good to fix from vkconfig's side as well.
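For illustration only, the blocklist idea could be sketched as below. This is a minimal sketch, not vkconfig's actual code: `BLOCKED_LAYERS` and `layers_for_ui` are hypothetical names, and the installed layer list is made-up sample data.

```python
# Hypothetical sketch of a UI blocklist; none of these names come from vkconfig.
BLOCKED_LAYERS = frozenset({"VK_LAYER_RENDERDOC_Capture"})


def layers_for_ui(installed_layers, blocked=BLOCKED_LAYERS):
    """Return the installed layers in their original order, minus blocked ones."""
    return [name for name in installed_layers if name not in blocked]


# Sample data: layers that might be found on a developer machine.
installed = [
    "VK_LAYER_KHRONOS_validation",
    "VK_LAYER_RENDERDOC_Capture",
    "VK_LAYER_KHRONOS_shader_object",
]
visible = layers_for_ui(installed)
```

An allowlist would be the same filter with the membership test inverted, so that only known layers reach the UI.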

And regarding the use case of debugging layers with RenderDoc? Would only controlling the execution order in Vulkan Configurator be an issue for RenderDoc?

In Vulkan Configurator 2, because of the Vulkan Loader design, we would enable an implicit layer only to control the execution order. In Vulkan Configurator 3, both capabilities are split, so I think we can remove the capability of enabling implicit layers.

Another use case is simply to use extension layers together with RenderDoc.

Debugging layers with RenderDoc is not an intended use case. My suggestion would be to create a testbed application that does any rendering or otherwise does the work you want on the application side of things. Debugging with other layers active should work fine (except on Android) as RenderDoc will attempt to enable any layers at instance creation time on replay that were used during capture.

My main concern for this issue is preventing problems and crashes for RenderDoc users due to the use of vkconfig.

"Debugging with other layers active should work fine (except on Android) as RenderDoc will attempt to enable any layers at instance creation time on replay that were used during capture."

Do you mean RenderDoc only supports enabling layers programmatically (using the Vulkan API)?

I'm not quite sure I follow what you mean. On replay RenderDoc requests any layers the application did through the API as those are the ones that might be needed, though they are not treated as a hard requirement like extensions - if some are not available they will be dropped and the capture will try to load anyway.

It doesn't try to recreate the same environment variables that were set during capture. Implicit/global/etc layers are not expected to be something the application requested so RenderDoc will not request them on replay either. Whether they are enabled through the environment is up to whatever is set at any given point.
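The replay behaviour described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not RenderDoc's implementation: `select_replay_layers` is a hypothetical helper, and the layer lists are sample data.

```python
# Hypothetical sketch of the described replay behaviour; not RenderDoc code.
def select_replay_layers(captured_app_layers, available_layers):
    """Split the layers the application requested via the API during capture
    into those to request on replay and those dropped as unavailable.

    Layers are a soft requirement: a missing one is dropped, not an error.
    Implicit or environment-enabled layers are never in captured_app_layers,
    so they are never requested here.
    """
    available = set(available_layers)
    requested, dropped = [], []
    for name in captured_app_layers:
        if name in available:
            requested.append(name)
        else:
            dropped.append(name)
    return requested, dropped


# Sample data: the capture used two layers, only one exists on the replay host.
requested, dropped = select_replay_layers(
    ["VK_LAYER_KHRONOS_validation", "VK_LAYER_KHRONOS_shader_object"],
    ["VK_LAYER_KHRONOS_validation"],
)
```

In this sketch the capture still attempts to load with whatever ended up in `requested`, which mirrors the "dropped and the capture will try to load anyway" wording.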

In the case of extension layers, for example the shader object extension layer, if the RenderDoc layer came after the extension layer during capture, the extension layer would not have to be enabled during replay. The replay would work without it, since the commands that required the extension layer would have been replaced. I know this is not an intended use case for you, but it would have been useful.

I've added a check for this on RenderDoc's end to help prevent user error. I don't know what the timeline is for releasing vkconfig3 but would it be feasible to add a change to the existing version of vkconfig to hide RenderDoc's layer from the UI? At least if there will be another SDK release before vkconfig3.

Testing for the next SDK release starts this Monday. Tight schedule and unlikely vkconfig2 will be modified in time. The following SDK is planned to deliver vkconfig3.

Even if we modify vkconfig2 for this current SDK release, not all users in the ecosystem immediately upgrade to it and there isn't a way to force the ecosystem over to the latest vkconfig2. So there will always be some users out there on the version of vkconfig2 without the change.

I understand, I didn't know what the timeline was. Indeed it's not going to fix all cases, but I thought it better to start sooner rather than later. The warning on RenderDoc's side should help as well, as people update.

I'll leave this issue open to request the same blocking/hiding in vkconfig3.

"I'll leave this issue open to request the same blocking/hiding in vkconfig3."

It doesn't seem reasonable to us to simply block/hide some layers. For example, a common use case for a Vulkan developer is to disable all layers on the system to double-check that installed layers are not causing issues with their application.

Would RenderDoc's layer cause any issues for a user in that case? Maybe not, but it's not unreasonable for a Vulkan application developer to make a blanket check that no layer is causing the issue they are trying to pinpoint, without giving it further thought.

That said, we agree that forcing implicit layers is a risky endeavour, which is one reason for the Vulkan Loader redesign of layer overriding. In Vulkan Configurator 2, we were already notifying Vulkan developers that disabling implicit layers can cause issues, including crashes. We are now doing the same for enabling implicit layers to resolve this issue, and that will ship in the next SDK.

We find it a bit unfortunate that RenderDoc prevents debugging Vulkan layers. Having a separate layer for recording and a UI for replay is not an issue with gfxreconstruct. But it's OK with us, that's your choice to make.

Yes I agree that the main problem is the layer being activated by other programs. When it comes to disabling layers, vkconfig wouldn't need to do anything special to disable all layers including RenderDoc's because RenderDoc's layer is never activated by default - if you do nothing it's already not active.

I don't really like that vkconfig will still allow users to do that even with a warning (that many users will just click through), but if this is a temporary measure and then it's blocked entirely in the next version then that's not a bad compromise.

I'm not really sure I understand what you mean in the last bit. RenderDoc doesn't prevent debugging Vulkan layers, it's simply not a supported workflow and may or may not behave correctly. As I mentioned above, layers used during capture will be enabled during replay if available, otherwise RenderDoc does not interact with the layer system.