bkaradzic/bgfx

VK_ERROR_FRAGMENTED_POOL on Android Oppo Find X5 with Vulkan backend

cryforyou opened this issue · 3 comments

Describe the bug
When running the cube sample on Android (Oppo Find X5 Pro), VK_ERROR_FRAGMENTED_POOL occurs after several seconds.

To Reproduce
Steps to reproduce the behavior:

  1. Build and run on Android using https://github.com/Nodrev/bgfx-android-activity
  2. Modify the samples to use the Vulkan backend
  3. Choose the cube sample
  4. Wait several seconds; the app crashes and VK_ERROR_FRAGMENTED_POOL is printed

Expected behavior
The sample should run on Android without crashing.

Additional context
The Vulkan backend creates and frees descriptor sets every frame, which fragments the descriptor pool and triggers this error on the Mali driver. Could you consider caching the descriptor sets?

I can confirm that this still seems to be an issue. After a relatively short time, a fatal error is raised on Mali devices at this call:

VK_CHECK(vkAllocateDescriptorSets(m_device, &dsai, &descriptorSet) );

This approach to creating descriptor sets also seems to cause a significant slowdown on both Adreno and Mali devices.

A step in the right direction is to stop freeing descriptor sets individually and instead keep one descriptor pool per frame of latency, resetting each pool at the start of its frame. That means changing the following, along with the code that creates, destroys, and allocates from the pool, so that it works with a set of pools instead of a single one (i.e. using m_cmd.m_currentFrameInFlight and m_numFramesInFlight):

// One pool per frame in flight instead of a single pool:
VkDescriptorPool m_descriptorPool[BGFX_CONFIG_MAX_FRAME_LATENCY];

void vkDestroy(VkDescriptorSet& _obj)
{
	if (VK_NULL_HANDLE != _obj)
	{
		// No longer freed individually; the owning pool is reset once per frame instead:
		// vkFreeDescriptorSets(s_renderVK->m_device, s_renderVK->m_descriptorPool, 1, &_obj);
		_obj = VK_NULL_HANDLE;
	}
}

// At the start of each frame:
ScratchBufferVK& scratchBuffer = m_scratchBuffer[m_cmd.m_currentFrameInFlight];
scratchBuffer.reset();

// The reset is needed now, since sets are no longer freed one by one:
VkDescriptorPool& descriptorPool = m_descriptorPool[m_cmd.m_currentFrameInFlight];
vkResetDescriptorPool(m_device, descriptorPool, 0);
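
To make the rotation pattern easier to follow in isolation, here is a minimal sketch that models it without a Vulkan device. `MockDescriptorPool` and `DescriptorPoolRing` are hypothetical names invented for this illustration and are not part of bgfx; the mock only counts allocations, standing in for `vkAllocateDescriptorSets` and `vkResetDescriptorPool`. It also assumes the caller has already waited for the GPU to finish the frame slot being reused before calling `beginFrame()`.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a VkDescriptorPool: it only counts live sets,
// so the pattern can be exercised without a Vulkan device.
struct MockDescriptorPool
{
	uint32_t capacity  = 0;
	uint32_t allocated = 0;

	// Models vkAllocateDescriptorSets: fails once the pool is exhausted.
	bool allocate()
	{
		if (allocated == capacity)
		{
			return false;
		}
		++allocated;
		return true;
	}

	// Models vkResetDescriptorPool: returns every set to the pool at once.
	void reset() { allocated = 0; }
};

// Hypothetical ring of pools, one per frame in flight.
template <uint32_t NumFramesInFlight>
class DescriptorPoolRing
{
	static_assert(NumFramesInFlight > 0, "need at least one frame in flight");

public:
	explicit DescriptorPoolRing(uint32_t setsPerPool)
	{
		for (MockDescriptorPool& pool : m_pools)
		{
			pool.capacity = setsPerPool;
		}
	}

	// Advance to the next frame slot and recycle all of its sets in one call.
	// Assumption: the GPU no longer uses sets allocated from this slot.
	void beginFrame()
	{
		m_current = (m_current + 1) % NumFramesInFlight;
		m_pools[m_current].reset();
	}

	// Allocate from the current frame's pool only; nothing is freed individually.
	bool allocate() { return m_pools[m_current].allocate(); }

	uint32_t currentFrame() const { return m_current; }

	uint32_t liveSets(uint32_t frame) const
	{
		assert(frame < NumFramesInFlight);
		return m_pools[frame].allocated;
	}

private:
	std::array<MockDescriptorPool, NumFramesInFlight> m_pools{};
	uint32_t m_current = NumFramesInFlight - 1; // first beginFrame() selects slot 0
};
```

Because each pool is only ever filled and then reset as a whole, there are no individually freed holes left behind, which is the situation the per-set `vkFreeDescriptorSets` calls were creating.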

@cryforyou I can confirm this fixes the Mali crashes, but it does not address the performance concerns. The slowdown comes mostly from updating so many descriptor sets each frame, even when their contents have not changed.

More information here: https://arm-software.github.io/vulkan_best_practice_for_mobile_developers/samples/performance/descriptor_management/descriptor_management_tutorial.html
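
One possible direction for the remaining performance cost is to cache descriptor sets by the resources they reference, so an unchanged binding combination reuses an existing set instead of being allocated and written again. The sketch below only models the lookup logic: `BindingKey`, `BindingKeyHash`, and `DescriptorSetCache` are hypothetical names, the integer handle stands in for a `VkDescriptorSet`, and none of this reflects how bgfx is actually structured.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <unordered_map>

// Hypothetical description of what a descriptor set points at.
// Real code would store Vulkan handles (buffer, image view, sampler).
struct BindingKey
{
	uint64_t uniformBuffer = 0;
	uint64_t texture       = 0;
	uint64_t sampler       = 0;

	bool operator==(const BindingKey& other) const
	{
		return uniformBuffer == other.uniformBuffer
			&& texture       == other.texture
			&& sampler       == other.sampler;
	}
};

// Combines the three fields into one hash value.
struct BindingKeyHash
{
	std::size_t operator()(const BindingKey& key) const
	{
		std::size_t h = std::hash<uint64_t>{}(key.uniformBuffer);
		h ^= std::hash<uint64_t>{}(key.texture) + 0x9e3779b9u + (h << 6) + (h >> 2);
		h ^= std::hash<uint64_t>{}(key.sampler) + 0x9e3779b9u + (h << 6) + (h >> 2);
		return h;
	}
};

class DescriptorSetCache
{
public:
	using SetHandle = uint32_t; // stand-in for VkDescriptorSet

	// Returns the cached set for this binding combination, creating it on a miss.
	SetHandle acquire(const BindingKey& key)
	{
		auto it = m_sets.find(key);
		if (it != m_sets.end())
		{
			return it->second; // hit: no allocation, no descriptor write
		}

		// Miss: real code would allocate and write the descriptor set here
		// (vkAllocateDescriptorSets followed by vkUpdateDescriptorSets).
		SetHandle handle = ++m_numAllocations;
		assert(handle != 0);
		m_sets.emplace(key, handle);
		return handle;
	}

	uint32_t numAllocations() const { return m_numAllocations; }

	// Drop all entries, e.g. when the referenced resources are destroyed.
	void clear() { m_sets.clear(); }

private:
	std::unordered_map<BindingKey, SetHandle, BindingKeyHash> m_sets;
	uint32_t m_numAllocations = 0;
};
```

Note that cached sets must outlive a single frame, so they could not come from a pool that is reset every frame; a cache like this would need its own pool, plus invalidation when a referenced resource is destroyed.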

T-rvw commented

+1: I also ran into this issue today in an Android application I built.

Thanks, @EvilTrev. I tried your solution and it fixed my problem: https://github.com/CatDogEngine/bgfx/pull/4/files

d-s-h commented

Thank you, guys.
I also hit this issue on my Samsung S21. The crash used to occur almost immediately after app start, but the fix mentioned above helped.