PyAV-Org/PyAV

Functional Difference In Data Stream between AV 11.0.0 and AV 12.0.0

RaubCamaioni opened this issue · 1 comments

Overview

The bytes returned for data-stream packets changed between PyAV 11.0.0 and PyAV 12.0.0.
I am using PyAV to separate KLV data from an MPEG-TS stream.
PyAV 12.0.0 returns different start bytes than PyAV 11.0.0.

Expected behavior

Consistent parsing behavior between PyAV versions unless a change is noted in the changelog.

Actual behavior

Data-stream parsing behavior changed: packets from the same data stream now begin with different bytes.

Investigation

Uninstall PyAV 12.0.0 (the latest version) and install PyAV 11.0.0.
Under each version, run code that prints the first 20 and last 20 bytes of every data-stream packet.

Reproduction

Compare the start bytes of the data-stream packets in an MPEG-TS file that contains data streams, once under each PyAV version.

import av
from pathlib import Path


def main(video: Path):
    with av.open(str(video)) as container:
        packets = container.demux()  # yields packets from every stream in the file

        for packet in packets:

            if packet.stream.type == "data":
                print(f"packet pts: {packet.pts}")
                print(f"start bytes: {bytes(packet)[:20]}")  # different start bytes
                print(f"end bytes: {bytes(packet)[-20:]}")  # same end bytes

            if packet.stream.type == "video":
                pass


if __name__ == "__main__":
    from argparse import ArgumentParser

    parser = ArgumentParser()
    parser.add_argument("-i", "--input", type=str, required=True)
    args = parser.parse_args()
    main(Path(args.input))

Example output:

pyav-11.0.0:
packet pts: 42811911
start bytes: b'\xfa\xab\x941\xbb\x11J\xaa\xbd\xc6\xfa\x9e\xb1\xb6$Z\x82\x01s\x82'
end bytes: b'\x82P2\x02\x0c\xfd\x82P6\x01\x01\x82P7\x01\x01\x01\x02\x8f:'

pyav-12.0.0:
packet pts: 42811911
start bytes: b'\x11J\xaa\xbd\xc6\xfa\x9e\xb1\xb6$Z\x82\x01s\x82P&\x08\x00\x05'
end bytes: b'\x82P2\x02\x0c\xfd\x82P6\x01\x01\x82P7\x01\x01\x01\x02\x8f:'
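
As a quick check on the output above, the two 20-byte prefixes can be compared directly. This is only an illustrative sketch: the helper name `prefix_shift` is hypothetical and not part of PyAV. It looks for the smallest shift at which the 11.0.0 prefix lines up with the start of the 12.0.0 prefix.

```python
# Start bytes copied from the example output above.
OLD_START = b'\xfa\xab\x941\xbb\x11J\xaa\xbd\xc6\xfa\x9e\xb1\xb6$Z\x82\x01s\x82'  # PyAV 11.0.0
NEW_START = b'\x11J\xaa\xbd\xc6\xfa\x9e\xb1\xb6$Z\x82\x01s\x82P&\x08\x00\x05'  # PyAV 12.0.0


def prefix_shift(old: bytes, new: bytes):
    """Return the smallest k such that old[k:] is a prefix of new, or None."""
    for k in range(len(old)):
        if new.startswith(old[k:]):
            return k
    return None


if __name__ == "__main__":
    print(prefix_shift(OLD_START, NEW_START))
```

For this one sample the shift is 5, so the 12.0.0 packet looks like the 11.0.0 packet with its first five bytes removed. Whether the same shift applies to other packets or other files is not established by this output.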

Versions

  • OS: Debian GNU/Linux 12 (bookworm)
  • PyAV runtime:
PyAV v11.0.0
library configuration: --disable-static --enable-shared --libdir=/tmp/vendor/lib --prefix=/tmp/vendor --disable-alsa --disable-doc --disable-libtheora --disable-mediafoundation --enable-fontconfig --enable-gmp --enable-gnutls --enable-gpl --enable-libaom --enable-libass --enable-libbluray --enable-libdav1d --enable-libfreetype --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-libopus --enable-libspeex --enable-libtwolame --enable-libvorbis --enable-libvpx --enable-libx264 --enable-libx265 --enable-libxcb --enable-libxml2 --enable-libxvid --enable-lzma --enable-version3 --enable-zlib
library license: GPL version 3 or later
libavcodec     60.  3.100
libavdevice    60.  1.100
libavfilter     9.  3.100
libavformat    60.  3.100
libavutil      58.  2.100
libswresample   4. 10.100
libswscale      7.  1.100

PyAV v12.0.0
library configuration: --disable-static --enable-shared --libdir=/tmp/vendor/lib --prefix=/tmp/vendor --disable-alsa --disable-doc --disable-libtheora --disable-mediafoundation --disable-videotoolbox --enable-fontconfig --enable-gmp --enable-gnutls --enable-libaom --enable-libass --enable-libbluray --enable-libdav1d --enable-libfreetype --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-libopus --enable-libspeex --enable-libtwolame --enable-libvorbis --enable-libvpx --enable-libxcb --enable-libxml2 --enable-lzma --enable-zlib --enable-version3 --enable-libx264 --disable-libopenh264 --enable-libx265 --enable-libxvid --enable-gpl
library license: GPL version 3 or later
libavcodec     60. 31.102
libavdevice    60.  3.100
libavfilter     9. 12.100
libavformat    60. 16.100
libavutil      58. 29.100
libswresample   4. 12.100
libswscale      7.  5.100
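
The two listings above can be compared mechanically to see which bundled libraries differ between the two wheels. The sketch below is a rough illustration that parses the version lines as pasted in this report; the helper names `parse_versions` and `changed_libraries` are hypothetical.

```python
# Version lines copied from the two listings above.
PYAV_11 = """
libavcodec     60.  3.100
libavdevice    60.  1.100
libavfilter     9.  3.100
libavformat    60.  3.100
libavutil      58.  2.100
libswresample   4. 10.100
libswscale      7.  1.100
"""

PYAV_12 = """
libavcodec     60. 31.102
libavdevice    60.  3.100
libavfilter     9. 12.100
libavformat    60. 16.100
libavutil      58. 29.100
libswresample   4. 12.100
libswscale      7.  5.100
"""


def parse_versions(listing: str) -> dict:
    """Map each library name to a (major, minor, micro) tuple."""
    versions = {}
    for line in listing.strip().splitlines():
        name, _, rest = line.strip().partition(" ")
        # The listing pads numbers with spaces, e.g. "60.  3.100".
        versions[name] = tuple(int(p) for p in rest.replace(" ", "").split("."))
    return versions


def changed_libraries(old: dict, new: dict) -> dict:
    """Return {name: (old_version, new_version)} for libraries that differ."""
    return {n: (old[n], new[n]) for n in old if n in new and old[n] != new[n]}


if __name__ == "__main__":
    diff = changed_libraries(parse_versions(PYAV_11), parse_versions(PYAV_12))
    for name, (before, after) in diff.items():
        print(name, before, "->", after)
```

Every library in the listing differs between the two builds, including libavformat (60.3.100 versus 60.16.100), which is the library that demuxes the MPEG-TS container.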

Research

Assuming this change was introduced in 12.0.0, there is currently no information available about the change in behavior.
Data-stream packets used to be directly parsable when passed to the KLV parser klvdata: https://github.com/paretech/klvdata
Now a variable number of start bytes needs to be removed before the data stream parses properly.
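
One way to check whether a packet still begins on a KLV boundary is to decode the length field that follows the key. The sketch below assumes the common KLV layout of a 16-byte key followed by a BER-encoded length (as in SMPTE ST 336). The function name `read_ber_length` and the fixed 16-byte key size are assumptions for illustration; nothing here is an API of PyAV or klvdata.

```python
KEY_SIZE = 16  # assumption: the KLV key is 16 bytes long

# Start bytes copied from the example output in this report.
OLD_START = b'\xfa\xab\x941\xbb\x11J\xaa\xbd\xc6\xfa\x9e\xb1\xb6$Z\x82\x01s\x82'  # PyAV 11.0.0
NEW_START = b'\x11J\xaa\xbd\xc6\xfa\x9e\xb1\xb6$Z\x82\x01s\x82P&\x08\x00\x05'  # PyAV 12.0.0


def read_ber_length(buf: bytes, offset: int = KEY_SIZE):
    """Decode a BER length at buf[offset]; return (length, bytes_consumed)."""
    first = buf[offset]
    if first < 0x80:
        # Short form: the byte itself is the length.
        return first, 1
    # Long form: the low 7 bits give the number of length bytes that follow.
    count = first & 0x7F
    if count == 0:
        raise ValueError("indefinite BER length is not supported")
    field = buf[offset + 1 : offset + 1 + count]
    if len(field) != count:
        raise ValueError("buffer too short for BER length field")
    return int.from_bytes(field, "big"), 1 + count


if __name__ == "__main__":
    print(read_ber_length(OLD_START))      # length field right after a 16-byte key
    print(read_ber_length(NEW_START, 11))  # same length bytes, five positions earlier
```

On the 11.0.0 prefix this decodes a length of 371 at offset 16. On the 12.0.0 prefix the same three length bytes sit at offset 11. If a packet carries exactly one KLV unit (an assumption), comparing `KEY_SIZE + bytes_consumed + length` against `len(bytes(packet))` gives a rough consistency check for alignment.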

This is a valid change for a major release.
We don't (and can't) guarantee that a data stream contains the same data when we change the FFmpeg version.