Igalia/vkrunner

Avoid recreating the VkDevice for every test

bpeel opened this issue · 6 comments

bpeel commented

Currently when running multiple tests VkRunner will create and destroy the VkDevice for each test. It even goes as far as dlopening and dlclosing libvulkan and re-requesting the function pointers each time. In order to speed this up it would probably be better keep the same VkDevice open in case opening it is slow on some platforms.

One potential way to do this could be have a vr_config function to pass in an existing VkDevice into the library. This would also solve the problem about VkRunner having to decide itself which device to use. However there is a problem with this that when you open the device you are supposed to specify which features and extensions you want. Each script can specify which features and extensions it requires so maybe the application would need to know this information upfront when opening the device.

Another option could be to have VkRunner internally recognise when a test uses a different set of extensions and features from the previous test and only reopen the device in that case. That might be difficult to take advantage of from the CTS where you really want to be able to call vr_execute for one test at a time. Potentially we could make an opaque object called vr_executor which could be a place for vr_execute to attach its VkDevice and reuse it for subsequent calls to vr_execute.

Most of the CTS tests are using the so-called "default" device created here: vktTestCase.cpp. All SPIR-V and GLSL based tests definitely use this device. I don't think that external components should be creating other devices, it would be really strange because in a multi-device system it wouldn't be clear what is being tested.

Additionally, aren't dlopen/dlclose expensive operations? If so, we should reduce uses of these calls as much as possible, at least for the CTS purposes.

I understand the desire for CTS to control the creation of the Vulkan device, for its use case.
I also really like the convenience of vkrunner's all-in-one solution for simple cases.

I think the goals are compatible with the right kind of refactoring of vkrunner's functionality, along the lines described above.

So far I've heard that the main responsibilities are:

  1. load the Vulkan library and cache function pointers
  2. create an appropriately configured device (with the right extensions, etc.); needs test file input
  3. given an appropriately configured device, setup a pipeline and run it; needs test file input

CTS wants to do 1 and 2, but needs vkrunner's input for 2. So we would need an interaction there.

2a. Given an input file, extract the config info needed to make a device: Vulkan API version, list of extensions, features bits? (I don't know exactly everything here.)
2b. Given a list of device requirements (in same form as yielded by 2a), create an approriately configured device.

A nice thing about 2a is it can help the CTS determine whether the test applies at all in the first place. It also lets CTS group tests according to requirements if it wants to.

The interaction diagram looks like this: https://gist.github.com/dneto0/ad687d76dd43b63a38f7cc52d57522cb

(I made that with https://bramp.github.io/js-sequence-diagrams/ )

In standalone mode, the vkrunner driver would take the role of CTS in the interaction.

What do you think?

cc @jaebaek

bpeel commented

As far as I can tell from vktTestCase.cpp, it seems to just enable any extensions and features that it knows about if they are available. I think in that case we maybe don’t really need VkRunner to communicate the requirements back to CTS because it can just assume that if the features are available then they have been enabled. If it is given both the vkPhysicalDevice and the vkDevice then it can still verify that the device supports the requirements and report SKIP otherwise.

Yeah, was just going to comment that the default device has all available features enable. And then it is up to the tests themselves to check that a particular needed feature is indeed available.

bpeel commented

I’ve gone ahead and implemented a combination of most of the things in the original report.

The API has changed so it now looks like this:

    struct vr_executor *executor = vr_executor_new();
    struct vr_config *config = vr_config_new();
    vr_config_add_script(config, "my_test.shader_test");
    vr_executor_execute(executor, config);
    vr_executor_free(executor);
    vr_config_free(config);

The vr_executor is now a place for VkRunner to attach its VkDevice amongst other things. If it is used with multiple shader scripts then it will reuse the same device if the requirements are all the same.

There is an additional optional API function to set an external device like this:

/* Sets an externally created device to use for the execution. Note
 * that it is the callers responsibility to ensure that the device has
 * all the necessary features and extensions enabled for the tests.
 *
 * This function is optional and if it is not used then VkRunner will
 * search for an appropriate physical device and create the VkDevice
 * itself.
 */
void
vr_executor_set_device(struct vr_executor *executor,
                       void *lib_vulkan,
                       /* VkInstance */
                       void *instance,
                       /* VkPhysicalDevice */
                       void *physical_device,
                       int queue_family,
                       /* VkDevice */
                       void *device);

If that is called then VkRunner will use the existing device and load the function pointers from the already dlopen’d library.

The next step is to try and get CTS to use this but I think it’s not entirely straight forward to get access to the Vulkan library handle. It would be great if we could integrate VkRunner deeper into CTS so that a pointer to the vr_executor will be somewhere in the test context so that any test group can use VkRunner tests. If we do that then it might be easier to get access to the library handle. I’m sort of hoping someone who knows the CTS code better might do this.

From VkRunner’s point of view I think this issue is closed but we can reopen it or file a new one if the API isn’t suitable for CTS integration.

Thank you for discussing this issue.
The suggested APIs allows CTS to run VkRunner in the same process with the existing device.
Looks good to me.