NVIDIA/NvPipe

Asynchronous encoder and best practices for high performance

hoax-killer opened this issue · 1 comment

Hi,

I am trying to achieve high performance and am not sure how to go about it. Here is my plan:

Thread 1: Reading/Writing on network socket (using boost::asio)
Thread 2: Rendering using OpenGL textures
Thread 3: Encoding Rendered buffers (using NvPipe)

**render frame i** -> push rendered frame into queue_1
pop frame from queue_1 and **encode frame** -> push encoded frame into queue_2
pop encoded frame from queue_2 and perform async_write() over network socket

Is there a way I can do the tasks in bold (i.e., render frame and encode frame) in an asynchronous manner, such that a completion handler would push the rendered/encoded frame into the corresponding queue?

Can I use NvPipe to achieve asynchronous encoding? If yes, can I get some pointers on how to do it?

Thanks!

Thanks for reaching out, and sorry for the delay!

NvPipe itself is a synchronous API. However, you can use it in an asynchronous setup, where rendering, encoding, and sending happen in different threads. Check out this paper for an idea of how things could be realized.

To overlap rendering and encoding, I suggest doing the GL/CUDA interop manually in the rendering thread (i.e., don't use NvPipe's encodeGL convenience functions), and storing the raw image data in CUDA device memory for consumption by the encode thread.
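A sketch of what that manual interop could look like, loosely following what `NvPipe_EncodeTexture` does internally. This assumes a current GL context, an RGBA8 `GL_TEXTURE_2D`, and a destination buffer already allocated in CUDA device memory; all error checking is omitted, and the function name is hypothetical. It is hardware-dependent and not runnable as-is without a GPU and GL context.

```cpp
#include <cuda_gl_interop.h>
#include <cuda_runtime.h>

// Copy the contents of an RGBA8 GL texture into CUDA device memory,
// where the encode thread can later pick it up.
void copyTextureToDevice(GLuint tex, void* dstDevice, size_t dstPitch,
                         unsigned width, unsigned height)
{
    cudaGraphicsResource_t res = nullptr;
    cudaGraphicsGLRegisterImage(&res, tex, GL_TEXTURE_2D,
                                cudaGraphicsRegisterFlagsReadOnly);
    cudaGraphicsMapResources(1, &res);

    cudaArray_t array = nullptr;
    cudaGraphicsSubResourceGetMappedArray(&array, res, 0, 0);

    // 4 bytes per pixel for RGBA8; device-to-device copy into the pool.
    cudaMemcpy2DFromArray(dstDevice, dstPitch, array, 0, 0,
                          width * 4, height, cudaMemcpyDeviceToDevice);

    cudaGraphicsUnmapResources(1, &res);
    cudaGraphicsUnregisterResource(res);
}
```

In a real pipeline you would register the texture once and cache the `cudaGraphicsResource_t` rather than re-registering every frame, since registration is expensive.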

For the paper, I implemented a concurrent blocking queue whose elements were basically pointers to frame-sized segments in a large device memory pool.

Feel free to check out the implementation of NvPipe_EncodeTexture for the necessary GL/CUDA interop code.

Hope this helps!
Tim