GPUOpen-Drivers/xgl

vkResetCommandPool slower than recreating, regardless of TRANSIENT_BIT

farnoy opened this issue · 5 comments

Hi, I hope this is the right repository.

I'm testing out a pattern where, instead of using RESET_COMMAND_BUFFER_BIT and freeing individual command buffers, I reset the entire pool. I currently record one command buffer with one vkCmdFillBuffer and one barrier, then bind things and issue 30 vkCmdDispatch calls. On the next frame, I synchronize to make sure that command buffer has finished executing, and then I reset the entire pool. Without the transient flag, the reset takes around 13 ms; with the transient flag enabled, it still takes 4 ms. So I ended up destroying the pool and creating a brand-new command pool to replace it, which is practically free by comparison and doesn't even register when profiling.

Is that a known performance cliff?

The reset costs nothing, though, if I do it right before destroying the pool and then swap in a brand-new one each frame. So maybe it's accruing something over multiple allocate and reset cycles? I also tried resetting it each frame with RELEASE_RESOURCES_BIT, without creating a new pool, but that has no effect on the runtime.

Our driver doesn't do anything with VK_COMMAND_POOL_CREATE_TRANSIENT_BIT so I don't know why that affects performance.

It isn't clear to me what you're measuring. Are you measuring the time for a whole frame, including rendering and synchronization, or just the time to reset a pool with a single command buffer in it?

I can't currently re-test, but yes, I was measuring only the vkResetCommandPool call, after the command buffer allocated from it had finished executing.
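For anyone trying to reproduce this, here is a minimal sketch of timing one call in isolation, which is the kind of measurement described above. The helper name `timeCallMs` is hypothetical and not part of the original report, and the Vulkan handles in the usage comment (`device`, `pool`, `frameFence`) are assumed to exist already. It measures CPU wall-clock time only.

```cpp
#include <chrono>
#include <utility>

// Hypothetical helper (not from the original report): runs `call` once and
// returns how long it took, in milliseconds of wall-clock time.
template <typename F>
double timeCallMs(F&& call) {
    const auto start = std::chrono::steady_clock::now();
    std::forward<F>(call)();
    const auto end = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(end - start).count();
}

// Usage sketch (assumed handles). The fence wait sits outside the timed
// region, so only the reset itself is measured:
//
//   vkWaitForFences(device, 1, &frameFence, VK_TRUE, UINT64_MAX);
//   double resetMs = timeCallMs([&] { vkResetCommandPool(device, pool, 0); });
```

Keeping the fence wait outside the timed region separates the cost of the reset from the cost of waiting for the GPU, which is the distinction the question above is asking about.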

You're finding that doing vkDestroyCommandPool + vkCreateCommandPool + vkAllocateCommandBuffers is faster than just doing vkResetCommandPool or vkResetCommandBuffer?

Yes, the vkDestroyCommandPool + vkCreateCommandPool + vkAllocateCommandBuffers path is faster than vkResetCommandPool in my use case. I haven't measured vkResetCommandBuffer alone; I'm migrating my app away from single-use CBs that I used to free with vkFreeCommandBuffers.