pywinrt/python-winsdk

How to use Direct3D11CaptureFrame with OpenCV / numpy ?

Avasam opened this issue · 6 comments

I can get from a Direct3D11CaptureFrame to a SoftwareBitmap like so:

from typing import Optional

from winsdk.windows.graphics.capture import Direct3D11CaptureFramePool
from winsdk.windows.foundation import IAsyncOperation, AsyncStatus
from winsdk.windows.graphics.imaging import SoftwareBitmap, BitmapBufferAccessMode

def __windows_graphics_capture(frame_pool: Optional[Direct3D11CaptureFramePool], selection: Region):
    if not frame_pool:
        return None

    frame = frame_pool.try_get_next_frame()
    if not frame:
        return None

    def callback(async_operation: IAsyncOperation[SoftwareBitmap], async_status: AsyncStatus):
        if async_status != AsyncStatus.COMPLETED:
            return
        software_bitmap = async_operation.get_results()
        # I got a software bitmap, now what? 

    async_operation = SoftwareBitmap.create_copy_from_surface_async(frame.surface)  # pyright: ignore
    async_operation.completed = callback # Should be done synchronously, check how to do so later

According to the official Microsoft doc, I could call LockBuffer, then CreateReference, and finally GetBuffer to obtain a byte array (https://docs.microsoft.com/en-us/windows/uwp/audio-video-camera/imaging#create-or-edit-a-softwarebitmap-programmatically).
But when I do reference = software_bitmap.lock_buffer(BitmapBufferAccessMode.READ_WRITE).create_reference(), reference has no method get_buffer. (Maybe it's missing a binding? I'm also not sure that a byte array is even meaningful here.)

The doc does have a section about integration with OpenCV: https://docs.microsoft.com/en-us/windows/uwp/audio-video-camera/process-software-bitmaps-with-opencv#3-implement-the-opencvhelper-class
I believe opencv-python's equivalent of Mat used to be cv2.CreateMat, which I believe has now been replaced by a plain numpy array. But I'm once again stuck trying to obtain pPixelData, which comes from the IMemoryBufferByteAccess::GetBuffer method.

# Maybe this should work?
bitmap_buffer = software_bitmap.lock_buffer(BitmapBufferAccessMode.READ_WRITE)
memory_buffer_reference = bitmap_buffer.create_reference()
buffer = memory_buffer_reference.get_buffer()
image_as_np_array = np.frombuffer(buffer)

dlech commented

The Python wrappers for IBuffer and IMemoryBuffer implement the CPython buffer protocol, so they can usually be passed directly to other Python APIs that expect a bytes-like object.
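
As a rough illustration of what "implements the buffer protocol" means in practice, here is a minimal sketch that uses only the standard library. The `array.array` object is a hypothetical stand-in for a WinRT buffer (it is not a winsdk type); the point is only that any object exposing the buffer protocol can be handed to `memoryview()` or `bytes()` without an explicit conversion method.

```python
import array

# Hypothetical stand-in for a WinRT buffer object: array.array exposes the
# CPython buffer protocol, just like the wrappers described above.
raw = array.array("B", [10, 20, 30, 40])

# memoryview() creates a zero-copy view over the same memory.
view = memoryview(raw)

# bytes() copies the contents out of the buffer.
copied = bytes(raw)

# Writing through the original object is visible in the view, not in the copy.
raw[0] = 99
first_in_view = view[0]
first_in_copy = copied[0]
```

The zero-copy behaviour matters for image data: a consumer that accepts bytes-like objects can read the pixels in place instead of duplicating the whole frame.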

If I try to use it directly with numpy, I get the following error:

bitmap_buffer = software_bitmap.lock_buffer(BitmapBufferAccessMode.READ_WRITE)
np.frombuffer(bitmap_buffer, dtype=np.int8)

a bytes-like object is required, not '_winsdk_Windows_Graphics_Imaging.BitmapBuffer'

dlech commented

https://github.com/pywinrt/pywinrt/blob/6102997bbb08c37e80f16ddf97f181bd71403091/src/package/pywinrt/projection/readme.md#buffer-protocol

It looks like in the case of IMemoryBuffer you need to call create_reference() and then pass its return value to numpy.

I was so close, thanks for steering me in the right direction!

software_bitmap = async_operation.get_results()
reference = software_bitmap.lock_buffer(BitmapBufferAccessMode.READ_WRITE).create_reference()
image = np.frombuffer(cast(bytes, reference), dtype=np.uint8)

dlech commented

Is it not possible without the cast? I guess that just makes the linter happy?

software_bitmap = async_operation.get_results()
buffer = software_bitmap.lock_buffer(BitmapBufferAccessMode.READ_WRITE)
image = np.frombuffer(buffer.create_reference(), dtype=np.int8)

The cast is just for the linter when type checking, from https://github.com/numpy/numpy/blob/main/numpy/__init__.pyi#L1433:

# There is currently no exhaustive way to type the buffer protocol,
# as it is implemented exclusivelly in the C API (python/typing#593)
_SupportsBuffer = Union[
    bytes,
    bytearray,
    memoryview,
    _array.array[Any],
    mmap.mmap,
    NDArray[Any],
    generic,
]
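
To make the "just for the linter" point concrete, here is a small sketch showing that `typing.cast` does nothing at runtime. The ctypes array is a hypothetical stand-in for the WinRT reference object: like it, the array supports the buffer protocol but is not one of the types listed in the union above. Wrapping the object in `memoryview` is shown as one possible alternative, on the assumption that the type checker accepts it because `memoryview` appears in that union.

```python
import ctypes
from typing import cast

# Hypothetical stand-in for the IMemoryBufferReference wrapper: a ctypes array
# supports the buffer protocol but is not listed in numpy's _SupportsBuffer.
reference_like = (ctypes.c_ubyte * 4)(1, 2, 3, 4)

# cast() only informs the type checker; it returns its argument unchanged.
as_bytes = cast(bytes, reference_like)
same_object = as_bytes is reference_like

# Alternative without cast(): memoryview is part of the union quoted above,
# and wrapping does not copy the underlying data.
wrapped = memoryview(reference_like)
wrapped_size = wrapped.nbytes
```

Either spelling leaves the data untouched, so the choice is purely about which one the type checker accepts.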

(I've also updated my previous message with dtype=np.uint8)

It is fully working, I've been able to render the image in my application.
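
For readers who want to take the flat uint8 array on to OpenCV, a possible next step is sketched below. It rests on assumptions not confirmed in this thread: that the bitmap is in a 4-bytes-per-pixel BGRA layout, that rows carry no padding (stride equals width times 4), and that the width and height are known (presumably from the bitmap's pixel width and height properties). The helper name `bgra_buffer_to_image` is made up for illustration, and a `bytearray` stands in for the WinRT buffer reference.

```python
import numpy as np


def bgra_buffer_to_image(buffer, width, height):
    """View a flat BGRA byte buffer as a (height, width, 4) uint8 array.

    Assumes 4 bytes per pixel and no row padding (hypothetical helper).
    """
    flat = np.frombuffer(buffer, dtype=np.uint8)
    return flat.reshape((height, width, 4))


# Stand-in for the buffer reference: a 3x2 image of opaque blue BGRA pixels.
width, height = 3, 2
fake_pixels = bytearray([255, 0, 0, 255] * (width * height))

image = bgra_buffer_to_image(fake_pixels, width, height)

# Dropping the alpha channel yields the 3-channel BGR layout OpenCV commonly uses.
bgr = image[:, :, :3]
```

If the rows turn out to be padded, the reshape would raise an error or produce a skewed image, so the stride should be checked before relying on this.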