meh/rust-ffmpeg

Unsafe memory handling when decoding frames

MichaelRiss opened this issue · 3 comments

Memory handling for decoded AVFrames can either be the responsibility of the caller or of the codec, depending on whether AVCodecContext.refcounted_frames is set to 1 or to 0.
Explained here (parameter picture):
http://ffmpeg.org/doxygen/3.2/group__lavc__decoding.html#ga3ac51525b7ad8bca4ced9f3446e96532
If refcounted_frames is set to 0, the decoder owns the memory of the frames and will reuse this memory when decoding further frames. Therefore frames are only valid until the next call to the decoder and this can cause problems for a rust programmer that expects the obtained ffmpeg::util::frame::video::Video struct to remain valid until it gets destroyed.
In my case I was extensively moving the obtained frames around (even through various threads) and it took me a long time to realize that the origin of my problems (frame duplication/reordering, contrast/color separation) was the ffmpeg decoder overwriting the frames that I had already decoded and which were in various stages of further processing. Cloning the frames immediately after decoding solved the problem for me.

So far I'm not sure how to solve this problem "cleanly". Maybe the decoder could just return something like a borrowed reference to a frame and therefore force the rust programmer to either release the reference or to clone the frame before calling the decoder again.

meh commented

The issue is that it's decided at runtime whether you can keep the frame around or not.

I guess it's a matter of deciding whether to require the user to always clone the frame to pass it around, or clone it internally when refcounted_frames is 0.

What do you think is the best option?

I was thinking about something like this: https://is.gd/BxBCSC
Try to comment out the curly braces in the main function to see how I propose to use the borrow checker to force the programmer to drop all references to the current frame before calling the decoder again to get the next frame.

This way, we should get optimal performance in all situations.

  • Short, immediate access to the Frame data structure that is finished before decoding the next frame:

    Can be done through the borrowed reference obtained by get_next_frame() to the internal Frame struct encapsulating the AVFrame and requires no copying.
    Before decoding the next frame the compiler forces the programmer to drop any borrowed handles to this internal data structure, so no data corruption can take place.

  • Long-term access to Frame data structure beyond the decoding of the next frame:

    Can be done with the get_next_frame().to_cloned() method combination. With copy-cloning the frame memory, the decoder memory management gets decoupled from rust memory management and allows further processing of the frames independent of the decoder.

A final, maybe over-the-top, optimization could take place in the to_cloned()-method. In case refcounted_frame is 1 the internal Frame could just be moved to the caller instead of copy-cloning the entire frame data. That would be faster. But the downside is, that you can move out the frame data only one time. So this method needs to return a Option<Frame> with Some() for the first call and None the following calls.

There are the std::borrow::Cow and std::borrow::ToOwned traits that potentially could model this idea. However, so far I haven't been able to figure out what happens to objects on the stack when they get encapsulated with Cow, do they get copied to the Heap? Ideally I would like to avoid any unnecessary copy operation.