intel/libva

"HW busy now" when calling vaSyncSurface sometimes

maxnelso opened this issue · 1 comments

Sorry for the vague question, but I'm not understanding the error I'm seeing from libva. My setup is as follows: I have two different video streams that I'm trying to decode simultaneously, so I have one VADisplay, two VAContextIDs, and two different VASurfaceIDs that I'm decoding to. Without getting too far into the weeds, I call something similar to this for each stream:

vaBeginPicture(hw_context_->display(), va_context_->context_id(), va_surface_id);
vaEndPicture(hw_context_->display(), va_context_->context_id());

Then, for each stream, I want to wait for the surface to be finished decoding:

vaSyncSurface(hw_context_->display(), va_surface_id);

However, some of the time (maybe 60% of calls?), I get VA_STATUS_ERROR_HW_BUSY when calling vaSyncSurface. What does this error mean in the context of decoding? From the docs it appears that this error should only happen if you try to kick off two decoding jobs on the same context, but I have been careful to create two different VAAPI contexts.

Curiously, if I call vaSyncSurface immediately after starting the decoding job for one stream (thereby making the whole process serial, and defeating the purpose of having two different streams), then I no longer see this error. It seems like there is some contamination across VAAPI contexts? Is the flow I'm trying to do supported?

I looked at the trace file, but it didn't contain anything useful as far as I could tell. This is an HEVC stream if that is important! Thanks for any help!

  1. From your description, decoding two streams simultaneously should work.

  2. The only requirement is that you avoid using the same VA object in both streams; please refer to the "multithread guide" at http://intel.github.io/libva/ . From your description, there should not be any such issue.

  3. HW_BUSY always means that something is wrong in the HW, and there should be a GPU hang or GPU reset; you can check dmesg to confirm it. Most likely it is because the same VA object was used across contexts. If you are sure that is not the case, please file an issue with a way to reproduce it in the backend driver repo, such as https://github.com/intel/media-driver
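
One way to rule out the "same VA object used in both streams" case before filing a driver issue is a debug-time check over the IDs each stream owns. The following is only a minimal sketch, not part of libva: `StreamObjects`, `SharedObjects` and `FindSharedObjects` are hypothetical names, and it assumes the application records every context, surface (render targets and reference frames) and buffer ID it hands to each stream. It relies only on the fact that these libva ID types are typedefs of `unsigned int`.

```cpp
#include <algorithm>
#include <iterator>
#include <set>
#include <vector>

// In va.h, VAContextID, VASurfaceID and VABufferID are typedefs of
// unsigned int (VAGenericID). A plain alias keeps this sketch free of
// libva headers; it is an illustration, not a libva type.
using VaId = unsigned int;

// Hypothetical bookkeeping: the VA objects one decode stream owns.
struct StreamObjects {
    VaId context;
    std::set<VaId> surfaces;  // render targets and reference frames
    std::set<VaId> buffers;   // parameter / slice data buffers
};

// Hypothetical result: which objects appear in both streams.
struct SharedObjects {
    bool same_context;
    std::vector<VaId> surfaces;
    std::vector<VaId> buffers;
    bool any() const {
        return same_context || !surfaces.empty() || !buffers.empty();
    }
};

// Returns the IDs present in both sorted sets.
static std::vector<VaId> Intersect(const std::set<VaId>& a,
                                   const std::set<VaId>& b) {
    std::vector<VaId> out;
    std::set_intersection(a.begin(), a.end(), b.begin(), b.end(),
                          std::back_inserter(out));
    return out;
}

// Compares like with like only (surfaces vs. surfaces, buffers vs.
// buffers), since numeric IDs of different object kinds may coincide.
SharedObjects FindSharedObjects(const StreamObjects& a,
                                const StreamObjects& b) {
    return SharedObjects{a.context == b.context,
                         Intersect(a.surfaces, b.surfaces),
                         Intersect(a.buffers, b.buffers)};
}
```

If `FindSharedObjects(stream_a, stream_b).any()` is true, the returned lists name the IDs to look at first. If it is false and the error persists together with a GPU hang in dmesg, that supports reporting it to the backend driver.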