PyAV-Org/PyAV

Expose `av_frame_make_writable`

abextm opened this issue · 2 comments

Overview

I'm currently decoding some VP9 video, modifying its frames, and writing them back out. Because I am modifying the plane data in place, this corrupts subsequent frames: the decoder keeps previous frames around so it can apply inter-frame (de)compression. FFmpeg provides av_frame_make_writable to support this use case (basically it just copies the plane data if you don't own it).
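
To illustrate the copy-if-shared behaviour described above, here is a toy pure-Python model. `SharedBuffer` and `ToyFrame` are hypothetical stand-ins, not FFmpeg or PyAV types, and the reference counting only approximates what av_frame_make_writable does:

```python
class SharedBuffer:
    """Toy stand-in for a reference-counted plane buffer (hypothetical)."""

    def __init__(self, data: bytes):
        self.data = bytearray(data)
        self.refcount = 1


class ToyFrame:
    """Toy stand-in for a frame that holds a single plane (hypothetical)."""

    def __init__(self, buf: SharedBuffer):
        self.buf = buf

    def ref(self) -> "ToyFrame":
        # A second frame sharing the same buffer, like the reference
        # the decoder keeps for inter-frame prediction.
        self.buf.refcount += 1
        return ToyFrame(self.buf)

    def make_writable(self) -> None:
        # Copy the data only if someone else also holds the buffer.
        if self.buf.refcount > 1:
            self.buf.refcount -= 1
            self.buf = SharedBuffer(bytes(self.buf.data))


decoder_ref = ToyFrame(SharedBuffer(b"\x10\x20\x30"))
user_frame = decoder_ref.ref()

# Without this call, the write below would also change decoder_ref's data.
user_frame.make_writable()
user_frame.buf.data[:] = b"\x00\x00\x00"
```

After `make_writable()`, the in-place write only touches the user's private copy, so the frame the "decoder" still references stays intact.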

Existing FFmpeg API

av_frame_make_writable

Expected PyAV API

I would expect a frame.make_writable() method.

Example:

for frame in cont.decode(video=0):
    frame.make_writable()
    modify_frame(frame)

Investigation

I called av_frame_make_writable with ctypes, which resolves the problem.

Reproduction

import av
import ctypes
import numpy

# this is horrible -- do not do this
class ctAVFrame(ctypes.Structure):
	_fields_=[
		("_ob_head", ctypes.c_byte * object.__basicsize__),
		("vtable", ctypes.c_void_p),
		("ptr", ctypes.c_void_p),
	]
_av_frame_make_writable = ctypes.CDLL(av.video.frame.__file__).av_frame_make_writable
_av_frame_make_writable.argtypes = (ctypes.c_void_p, )
_av_frame_make_writable.restype = ctypes.c_int
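def av_frame_make_writable(frame: av.VideoFrame):
	# av_frame_make_writable returns 0 on success or a negative AVERROR code
	ret = _av_frame_make_writable(ctypes.cast(id(frame), ctypes.POINTER(ctAVFrame)).contents.ptr)
	if ret < 0:
		raise RuntimeError(f"av_frame_make_writable failed: {ret}")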


in_cont = av.open("./input.webm")

out_cont = av.open("output.mkv", "w")
tmpl = in_cont.streams.video[0].codec_context
out_stream = out_cont.add_stream("libx264", tmpl.rate)
out_stream.options["preset"]="ultrafast"
out_stream.options["crf"]="28"
out_stream.width = tmpl.width
out_stream.height = tmpl.height
out_stream.pix_fmt = tmpl.pix_fmt
out_stream.thread_type = "AUTO"

for i, frame in enumerate(in_cont.decode(video=0)):
	if i % 20 == 0:
		# without this there is significant error
		# av_frame_make_writable(frame)
		# view the plane as raw bytes (frombuffer defaults to float64)
		numpy.frombuffer(frame.planes[0], dtype=numpy.uint8).fill(0)

	out_cont.mux(out_stream.encode(frame))

	if i > 2000:
		break

out_cont.mux(out_stream.encode(None))
out_stream.close()

I was running this on a VP9 video from yt-dlp (though I would expect this to happen with anything that uses inter-frame compression). With the fix commented out, there is a significant amount of error in the output.

Versions

  • OS: Arch Linux
  • PyAV runtime:
PyAV v12.1.0
library configuration: --disable-static --enable-shared --libdir=/tmp/vendor/lib --prefix=/tmp/vendor --disable-alsa --disable-doc --disable-libtheora --disable-mediafoundation --disable-videotoolbox --enable-fontconfig --enable-gmp --enable-gnutls --enable-libaom --enable-libass --enable-libbluray --enable-libdav1d --enable-libfreetype --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-libopus --enable-libspeex --enable-libtwolame --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libxcb --enable-libxml2 --enable-lzma --enable-zlib --enable-version3 --enable-libx264 --disable-libopenh264 --enable-libx265 --enable-libxvid --enable-gpl
library license: GPL version 3 or later
libavcodec     60. 31.102
libavdevice    60.  3.100
libavfilter     9. 12.100
libavformat    60. 16.100
libavutil      58. 29.100
libswresample   4. 12.100
libswscale      7.  5.100
  • PyAV build:
    binary wheel
  • FFmpeg:
    binary wheel

Additional context

It would be nice if av_frame_clone were exposed too, though I'm not sure exposing it as .clone() makes much sense, since it does not clone the actual plane data.

@abextm What use case do you expect if av_frame_clone() is exposed?

In my case I was doing detection on some video, and if a frame hit a case I would write it to an output stream. If it was only a marginal hit, I would also write it to a second output with some debugging data overlaid on top of it. Without av_frame_clone, order becomes important because I have to write the unmodified frame before drawing onto the debug frame, which is sort of annoying. With av_frame_clone I could just clone, make_writable, then do whatever without affecting the non-debug feed.
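
A rough sketch of that flow, using plain bytearrays in place of frames. `draw_debug_overlay` and `route_hit` are hypothetical names, and the `bytearray(plane)` copy only stands in for what clone followed by make_writable would give you:

```python
from typing import Optional, Tuple


def draw_debug_overlay(plane: bytearray) -> None:
    # Hypothetical overlay: mark the first two bytes of the plane.
    plane[:2] = b"\xff\xff"


def route_hit(plane: bytearray, marginal: bool) -> Tuple[bytearray, Optional[bytearray]]:
    """Return (main_output, debug_output) without relying on write order."""
    debug_plane = None
    if marginal:
        # Stand-in for clone + make_writable: a private copy of the data.
        debug_plane = bytearray(plane)
        draw_debug_overlay(debug_plane)
    # The main feed still sees the unmodified data, even though the
    # overlay was drawn before the main frame was written anywhere.
    return plane, debug_plane


main_out, debug_out = route_hit(bytearray(b"\x01\x02\x03\x04"), marginal=True)
```

Because the overlay is drawn on a private copy, the two outputs can be written in either order.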