fraunhoferhhi/vvdec

Decoding fails on a valid file?

birdie-github opened this issue · 30 comments

AV: 00:09:32 / unknown (100%) A-V:  0.000 Dropped: 8321
[ffmpeg/video] libvvdec: error in vvdec::decode - ret:-11 - (possibly recoverable) exception Exception occured: 
[ffmpeg/video] ERROR: In function "void vvdec::DecSlice::parseSlice(vvdec::Slice*, vvdec::InputBitstream*, int)" in /tmp/vvdec-2.1.3/source/Lib/DecoderLib/DecSlice.cpp:172: Expecting a terminating bit
[ffmpeg/video] ERROR CONDITION: !binVal
Error while decoding frame!
[ffmpeg/video] libvvdec: error in vvdec::decode - ret:-11 - restart required, please reinit. Exception occured: 
[ffmpeg/video] ERROR: In function "void vvdec::DecSlice::parseSlice(vvdec::Slice*, vvdec::InputBitstream*, int)" in /tmp/vvdec-2.1.3/source/Lib/DecoderLib/DecSlice.cpp:172: Expecting a terminating bit
[ffmpeg/video] ERROR CONDITION: !binVal
Error while decoding frame!
[ffmpeg/video] libvvdec: error in vvdec::decode - ret:-11 - restart required, please reinit. Exception occured: 
[ffmpeg/video] ERROR: In function "void vvdec::DecSlice::parseSlice(vvdec::Slice*, vvdec::InputBitstream*, int)" in /tmp/vvdec-2.1.3/source/Lib/DecoderLib/DecSlice.cpp:172: Expecting a terminating bit
[ffmpeg/video] ERROR CONDITION: !binVal
Error while decoding frame!
[ffmpeg/video] libvvdec: error in vvdec::decode - ret:-11 - restart required, please reinit. Exception occured: 
[ffmpeg/video] ERROR: In function "void vvdec::DecSlice::parseSlice(vvdec::Slice*, vvdec::InputBitstream*, int)" in /tmp/vvdec-2.1.3/source/Lib/DecoderLib/DecSlice.cpp:172: Expecting a terminating bit
[ffmpeg/video] ERROR CONDITION: !binVal
Error while decoding frame!
[ffmpeg/video] libvvdec: error in vvdec::decode - ret:-11 - restart required, please reinit. Exception occured: 
[ffmpeg/video] ERROR: In function "void vvdec::DecSlice::parseSlice(vvdec::Slice*, vvdec::InputBitstream*, int)" in /tmp/vvdec-2.1.3/source/Lib/DecoderLib/DecSlice.cpp:172: Expecting a terminating bit
[ffmpeg/video] ERROR CONDITION: !binVal
Error while decoding frame!
[ffmpeg/video] libvvdec: error in vvdec::decode - ret:-11 - restart required, please reinit. Exception occured: 
[ffmpeg/video] ERROR: In function "void vvdec::DecSlice::parseSlice(vvdec::Slice*, vvdec::InputBitstream*, int)" in /tmp/vvdec-2.1.3/source/Lib/DecoderLib/DecSlice.cpp:172: Expecting a terminating bit
[ffmpeg/video] ERROR CONDITION: !binVal
Error while decoding frame!
[ffmpeg/video] libvvdec: error in vvdec::decode - ret:-11 - restart required, please reinit. Exception occured: 
[ffmpeg/video] ERROR: In function "void vvdec::DecSlice::parseSlice(vvdec::Slice*, vvdec::InputBitstream*, int)" in /tmp/vvdec-2.1.3/source/Lib/DecoderLib/DecSlice.cpp:172: Expecting a terminating bit
[ffmpeg/video] ERROR CONDITION: !binVal
Error while decoding frame!
[ffmpeg/video] libvvdec: error in vvdec::decode - ret:-11 - restart required, please reinit. Exception occured: 
[ffmpeg/video] ERROR: In function "void vvdec::DecSlice::parseSlice(vvdec::Slice*, vvdec::InputBitstream*, int)" in /tmp/vvdec-2.1.3/source/Lib/DecoderLib/DecSlice.cpp:172: Expecting a terminating bit
[ffmpeg/video] ERROR CONDITION: !binVal
Error while decoding frame!
[ffmpeg/video] libvvdec: error in vvdec::decode - ret:-11 - restart required, please reinit. Exception occured: 
[ffmpeg/video] ERROR: In function "void vvdec::DecSlice::parseSlice(vvdec::Slice*, vvdec::InputBitstream*, int)" in /tmp/vvdec-2.1.3/source/Lib/DecoderLib/DecSlice.cpp:172: Expecting a terminating bit
[ffmpeg/video] ERROR CONDITION: !binVal
Error while decoding frame!
[ffmpeg/video] libvvdec: error in vvdec::decode - ret:-11 - restart required, please reinit. Exception occured: 
[ffmpeg/video] ERROR: In function "void vvdec::DecSlice::parseSlice(vvdec::Slice*, vvdec::InputBitstream*, int)" in /tmp/vvdec-2.1.3/source/Lib/DecoderLib/DecSlice.cpp:172: Expecting a terminating bit
[ffmpeg/video] ERROR CONDITION: !binVal
Error while decoding frame!
[ffmpeg/video] libvvdec: error in vvdec::decode - ret:-11 - restart required, please reinit. Exception occured: 
[ffmpeg/video] ERROR: In function "void vvdec::DecSlice::parseSlice(vvdec::Slice*, vvdec::InputBitstream*, int)" in /tmp/vvdec-2.1.3/source/Lib/DecoderLib/DecSlice.cpp:172: Expecting a terminating bit
[ffmpeg/video] ERROR CONDITION: !binVal
Error while decoding frame!
[ffmpeg/video] libvvdec: error in vvdec::decode - ret:-11 - restart required, please reinit. Exception occured: 
[ffmpeg/video] ERROR: In function "void vvdec::DecSlice::parseSlice(vvdec::Slice*, vvdec::InputBitstream*, int)" in /tmp/vvdec-2.1.3/source/Lib/DecoderLib/DecSlice.cpp:172: Expecting a terminating bit
[ffmpeg/video] ERROR CONDITION: !binVal
Error while decoding frame!
[ffmpeg/video] libvvdec: error in vvdec::decode - ret:-11 - restart required, please reinit. Exception occured: 
[ffmpeg/video] ERROR: In function "void vvdec::DecSlice::parseSlice(vvdec::Slice*, vvdec::InputBitstream*, int)" in /tmp/vvdec-2.1.3/source/Lib/DecoderLib/DecSlice.cpp:172: Expecting a terminating bit
[ffmpeg/video] ERROR CONDITION: !binVal
Error while decoding frame!
[ffmpeg/video] libvvdec: error in vvdec::decode - ret:-11 - restart required, please reinit. Exception occured: 
[ffmpeg/video] ERROR: In function "void vvdec::DecSlice::parseSlice(vvdec::Slice*, vvdec::InputBitstream*, int)" in /tmp/vvdec-2.1.3/source/Lib/DecoderLib/DecSlice.cpp:172: Expecting a terminating bit
[ffmpeg/video] ERROR CONDITION: !binVal
Error while decoding frame!
[ffmpeg/video] libvvdec: error in vvdec::decode - ret:-11 - restart required, please reinit. Exception occured: 
[ffmpeg/video] ERROR: In function "void vvdec::DecSlice::parseSlice(vvdec::Slice*, vvdec::InputBitstream*, int)" in /tmp/vvdec-2.1.3/source/Lib/DecoderLib/DecSlice.cpp:172: Expecting a terminating bit
[ffmpeg/video] ERROR CONDITION: !binVal
Error while decoding frame!
[ffmpeg/video] libvvdec: error in vvdec::decode - ret:-11 - restart required, please reinit. Exception occured: 
[ffmpeg/video] ERROR: In function "void vvdec::DecSlice::parseSlice(vvdec::Slice*, vvdec::InputBitstream*, int)" in /tmp/vvdec-2.1.3/source/Lib/DecoderLib/DecSlice.cpp:172: Expecting a terminating bit
[ffmpeg/video] ERROR CONDITION: !binVal
Error while decoding frame!
[ffmpeg/video] libvvdec: error in vvdec::decode - ret:-11 - restart required, please reinit. Exception occured: 
[ffmpeg/video] ERROR: In function "void vvdec::DecSlice::parseSlice(vvdec::Slice*, vvdec::InputBitstream*, int)" in /tmp/vvdec-2.1.3/source/Lib/DecoderLib/DecSlice.cpp:172: Expecting a terminating bit
[ffmpeg/video] ERROR CONDITION: !binVal
Error while decoding frame!
[ffmpeg/video] libvvdec: error in vvdec::decode - ret:-11 - restart required, please reinit. Exception occured: 
[ffmpeg/video] ERROR: In function "void vvdec::DecSlice::parseSlice(vvdec::Slice*, vvdec::InputBitstream*, int)" in /tmp/vvdec-2.1.3/source/Lib/DecoderLib/DecSlice.cpp:172: Expecting a terminating bit
[ffmpeg/video] ERROR CONDITION: !binVal
Error while decoding frame!
[ffmpeg/video] libvvdec: error in vvdec::decode - ret:-11 - restart required, please reinit. Exception occured: 
[ffmpeg/video] ERROR: In function "void vvdec::DecSlice::parseSlice(vvdec::Slice*, vvdec::InputBitstream*, int)" in /tmp/vvdec-2.1.3/source/Lib/DecoderLib/DecSlice.cpp:172: Expecting a terminating bit
[ffmpeg/video] ERROR CONDITION: !binVal
Error while decoding frame!
[ffmpeg/video] libvvdec: error in vvdec::decode - ret:-11 - restart required, please reinit. Exception occured: 
[ffmpeg/video] ERROR: In function "void vvdec::DecSlice::parseSlice(vvdec::Slice*, vvdec::InputBitstream*, int)" in /tmp/vvdec-2.1.3/source/Lib/DecoderLib/DecSlice.cpp:172: Expecting a terminating bit
[ffmpeg/video] ERROR CONDITION: !binVal
Error while decoding frame!
[ffmpeg/video] libvvdec: error in vvdec::decode - ret:-11 - restart required, please reinit. Exception occured: 
[ffmpeg/video] ERROR: In function "void vvdec::DecSlice::parseSlice(vvdec::Slice*, vvdec::InputBitstream*, int)" in /tmp/vvdec-2.1.3/source/Lib/DecoderLib/DecSlice.cpp:172: Expecting a terminating bit
[ffmpeg/video] ERROR CONDITION: !binVal
Error while decoding frame!
[ffmpeg/video] libvvdec: error in vvdec::decode - ret:-11 - restart required, please reinit. Exception occured: 
[ffmpeg/video] ERROR: In function "void vvdec::DecSlice::parseSlice(vvdec::Slice*, vvdec::InputBitstream*, int)" in /tmp/vvdec-2.1.3/source/Lib/DecoderLib/DecSlice.cpp:172: Expecting a terminating bit
[ffmpeg/video] ERROR CONDITION: !binVal
[ffmpeg/video] libvvdec: Too many errors when draining, this is a bug. Stop draining and force EOF.
Error while decoding frame!

I'm using mpv + ffmpeg 6.0.1 patched with vvdec support.

The raw H.266 file can be downloaded here: https://mega.nz/file/PgNiwLzA#pIRNACUVTpD1xcbWlo0Il4p_vv2E5nE4l5c-1Y5_hno

The source file was encoded using VVENC 1.10.0 this way:

ffmpeg -i *mp4 -vf scale=1024x576:flags=lanczos,fps=30 -f yuv4mpegpipe - | nice -20 vvencapp --y4m -i - --preset slow -q 31 -o y4m.266

vvenc [info]: stats summary: frame= 192931 avg_fps= 0.5 avg_bitrate= 638.11 kbps
vvenc [info]: stats summary: frame I: 6029, kbps: 9202.92, AvgQP: 25.25
vvenc [info]: stats summary: frame P:    0, kbps:     nan, AvgQP: nan
vvenc [info]: stats summary: frame B: 186902, kbps:  361.83, AvgQP: 36.34


vvenc [info]:   Total Frames |   Bitrate     Y-PSNR    U-PSNR    V-PSNR    YUV-PSNR
vvenc [info]:      192931    a     638.1111   37.7132   43.8204   43.9743   37.6809
vvencapp [info]: Total Time: 401380.129 sec. Fps(avg): 0.481 encoded Frames 192931

Thanks for the report. Will look into it.

Can also reproduce with pure VVdeC, so not an FFmpeg integration issue.

Cannot decode with VTM either. Might be an encoder bug.

Hi @birdie-github, after trying things for a while and seeing inconclusive evidence, I took a step back and found issues with your report.

The posted log states you encoded 192931 frames at 30 fps, resulting in an average bitrate of 638.11 kbps. That means the resulting output size should be filesize = (192931 * 638110) / (8 * 30) bytes, i.e. number of frames * bitrate / (8 (bits per byte) * fps).

So the output produced should be around 513 MB. The file you provided is exactly 50000000 B large, which is roughly 460 MB short of the expected output.
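The estimate is easy to re-check in a shell. This is only a sketch using the values quoted from the vvenc log; the variable names are made up for illustration, and the result is an estimate because the logged bitrate is rounded.

```shell
#!/usr/bin/env bash
# Values copied from the vvenc stats summary quoted in this thread.
frames=192931          # encoded frames
fps=30                 # frame rate used for the encode
bitrate_bps=638110     # 638.11 kbps average bitrate, in bits per second
actual_bytes=50000000  # size of the uploaded file

# size in bytes = frames * bitrate / (8 bits per byte * fps)
expected_bytes=$(( frames * bitrate_bps / (8 * fps) ))

echo "expected: ${expected_bytes} bytes"
echo "missing:  $(( expected_bytes - actual_bytes )) bytes"
```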

The errors I am experiencing are caused by the last NAL unit being cut off in the middle, which makes the stream invalid. The question now is: what is the cause of your errors? Are you trying to play the full-length file, with the truncation only introduced by the mega.nz upload for us, or are your errors also caused by truncation already present on your system? Please let me know.

One possible cause could be that vvenc does not handle out-of-space write errors properly. We are working on improving that.

Maybe you ran out of space when encoding but the encoder didn't catch that?

See fraunhoferhhi/vvenc#341.
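Until the encoder reports such write errors itself, a size check after encoding can flag a short output. A minimal sketch, assuming you have an expected size estimated from the encoder log; `check_bitstream_size` and the 90% threshold are hypothetical and not part of vvenc or vvdec.

```shell
#!/usr/bin/env bash
# Hypothetical helper: warn when a bitstream is much smaller than the size
# estimated from the encoder log. Not part of vvenc or vvdec.
check_bitstream_size() {
    local file=$1 expected_bytes=$2
    local actual_bytes
    actual_bytes=$(wc -c < "$file") || return 2
    # Flag files below 90% of the estimate (the threshold is arbitrary).
    if (( actual_bytes * 10 < expected_bytes * 9 )); then
        echo "WARNING: $file is $actual_bytes bytes, expected about $expected_bytes; possibly truncated"
        return 1
    fi
    echo "OK: $file is $actual_bytes bytes"
}

# Usage: check_bitstream_size y4m.266 512963335
```

A size check cannot prove that a stream is intact; it only catches gross truncation such as a file that stops at a round byte count.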

Will close in a few days if no feedback.

Hello Adam! Sorry for being unresponsive, I was on the road.

The 50MB file was simply cut from the full encoded file manually. I am now checking whether the full file is equally affected. Maybe you're right and 9:20 is simply the maximum length of this 50MB extract.

And, you're exactly right, sorry for the noise. The problem goes away for the full file.

All good. Related or not, we still caught an issue from that report, even if in another project.

@adamjw24

  1. I wonder if vvdec could be made a little bit more resilient, so that it could detect when input is cut off unexpectedly and say so, instead of showing cryptic decoding errors.

In case you need it, here's the full encode (muxed using gpac/mp4box).

  2. BTW vvdec looks extremely CPU heavy for such a low res 30fps encode.

My Ryzen 7 5800X requires a whopping 60W to decode it (idle power consumption is 21W, so decoding itself costs ~40W).

My laptop Ryzen 7 7840HS goes above 20W and reaches temps over 80C.

I've read that VVC/H.266 is twice as heavy as H.265/HEVC in terms of decoding complexity, but it looks more like 10 times as heavy.

  1. You can enable error resilience using -eh 1. Not sure if ffmpeg uses it. In your case though, there is nothing to recover to, as there are no further frames.

  2. That is very strange. Your file with 576p, 30fps at 640kbps is indeed very light. My CPU (i9-11900H, with 2.5GHz base freq, so way less than the Ryzen 7 5800X) can do 200+ fps on this file single-threaded, and almost 1000 fps with default settings (in WSL1, so with an extra layer in there). We use blocking, not busy waiting, so CPU resources should free up if you are decoding at playback speed. Don't know about the energy consumption though. How much is the CPU usage?

  • Are you sure you are using the release build? To check, how big is your vvdecapp (or your libvvdec.a/so)?

VVC should not use more than 2x HEVC, but of course this only applies for comparable implementations. There are faster commercial decoders out there, but VVdeC is fairly good actually.

> Are you sure you are using the release build? To check, how big is your vvdecapp (or your libvvdec.a/so)?

Here's vvdec-2.1.3 built from the official sources without any patches or anything on Fedora 39:

libvvdec-2.1.3.tar.gz

It's 2'390'792 bytes. FFMpeg 6.0 is linked to it dynamically.

Power consumption on Linux can be watched with turbostat while the file plays in ffplay or mpv:

sudo /sbin/setcap cap_sys_rawio,cap_sys_nice=+ep /usr/bin/turbostat
while :; do
    turbostat --quiet --num_iterations 1 --interval 1 --show PkgWatt --Summary | tail -1
    sleep 1
done
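
Single one-second samples jump around, so averaging them makes comparisons easier. A small sketch, assuming one numeric PkgWatt value per line on stdin; `mean_watts` is a hypothetical helper name, not a turbostat feature.

```shell
#!/usr/bin/env bash
# Hypothetical helper: average one numeric sample per line read from stdin.
# Non-numeric lines (for example a header) are ignored.
mean_watts() {
    awk '$1 ~ /^[0-9.]+$/ { sum += $1; n++ }
         END { if (n) printf "%.2f W over %d samples\n", sum / n, n }'
}

# Usage (assumes turbostat is installed and permitted as set up above):
# for i in $(seq 60); do
#     turbostat --quiet --num_iterations 1 --interval 1 --show PkgWatt --Summary | tail -1
# done | mean_watts
```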

Looks good.

Which makes your issue very strange. If you have some time to help me better understand this...

  • which compiler is used?
  • what does the prompt say about the used SIMD extension?
  • does changing the SIMD make a difference with respect to power use and performance?
  • if you decode using the app, how many fps does the app achieve (-f 1000 should be enough)? What about -t 0?

> which compiler is used?

gcc-13.2.1-6.fc39.x86_64 (the standard default compiler in Fedora 39)

> what does the prompt say about the used SIMD extension?

How can I learn this?

> if you decode using the app, how many fps does the app achieve (-f 1000 should be enough)? What about -t 0?

What's the command to check this?

>> which compiler is used?

> gcc-13.2.1-6.fc39.x86_64 (the standard default compiler in Fedora 39)

Should not be a problem.

>> what does the prompt say about the used SIMD extension?

> How can I learn this?

See the following example, based on your faulty bitstream:

image

In the first line, last part, in my case SIMD=AVX2.

>> if you decode using the app, how many fps does the app achieve (-f 1000 should be enough)? What about -t 0?

> What's the command to check this?

Again, based on the previous bitstream:

image

Edit1: The last command you could try with both -t 0 and without (default, as many as your CPU can handle).

Edit2: vvdecapp is not installed by default, but can be found in the bin/release-shared folder of the build directory

./vvdecapp -b y4m.266 -f 1000 -v 3
VVdeC, the Fraunhofer VVC/H.266 decoder, version 2.1.3 [THREADS=16; PARSE_DELAY=16; SIMD=AVX2]
vvdecapp [info]: SizeInfo: 1024x576 (10b)
vvdecapp [info]: decoded Frames: 793 Fps: 793
vvdecapp [info]: 2024-Jan-03 18:30:37.674913: 1000 frames decoded @ 809.717 fps (1.235 sec)

Looks good for Ryzen 7 7840HS.

However, just to be extra sure I did this and power consumption remains the same, around 16W:

./vvdecapp --output - --y4m --bitstream y4m.266 2>/dev/zero | mpv --no-config -vo null -
VVdeC, the Fraunhofer VVC/H.266 decoder, version 2.1.3 [THREADS=16; PARSE_DELAY=16; SIMD=AVX2]
[file] Reading from stdin...
 (+) Video --vid=1 (rawvideo 1024x576 30.000fps)
VO: [null] 1024x576 yuv420p10

For comparison with HEVC, here are some samples: https://www.elecard.com/videos. Let's use this one, 1280x720 at 60fps. On Windows you could use e.g. HWiNFO64 sensors (you can place individual sensors in your systray) to see power consumption; the sensor is called CPU Package Power.

mpv --no-config -vo null -loop TSU_1280x720.mp4

Average power consumption observed on the 7840HS: 5W for HEVC (1280x720, 60fps) vs over 16W for the VVC encode (1024x576, 30fps), so roughly three times as expensive on the CPU. I'm using the built-in HEVC decoder in FFmpeg 6.0.1. I doubt any mobile device other than the latest two or three generations of iPhones will be able to decode 1080p 60fps VVC in real time in software.

Let's try an AV1 clip, again 1280x720, 60fps:

mpv --no-config -vo null -loop CityHall_1280x720.webm 

Power consumption observed: 5W.

I will try FFmpeg's built-in VVC decoder later.

Thanks a lot!

3x is actually alright. They do a good job of optimizing at FFmpeg. We also don't know how much of HEVC they really used in the bitstream. With VVenC slow you're basically getting all of VVC.

Your FPS is good as well. And you are using an internal system buffer (pipe) which is some workload as well.

So where does the 10x with 60W come from? Is that what you get when you use FFmpeg or mpv patched with VVdeC?

> Average power consumption observed on the 7840HS: 5W for HEVC (1280x720, 60fps) vs over 16W for the VVC encode (1024x576, 30fps), so roughly three times as expensive on the CPU. I'm using the built-in HEVC decoder in FFmpeg 6.0.1. I doubt any mobile device other than the latest two or three generations of iPhones will be able to decode 1080p 60fps VVC in real time in software.

My Pixel phone has been doing it for a year, since the patches were posted. Without even native NEON optimizations. Through an FFmpeg integration. So...

The threading implementation in the decoder is currently optimized for throughput and not for power efficiency. You can improve the situation by using fewer threads for smaller videos: try setting the number of threads to 2 or 4 and the power consumption will probably go down.

> 3x is actually alright. They do a good job of optimizing at FFmpeg. We also don't know how much of HEVC they really used in the bitstream. With VVenC slow you're basically getting all of VVC.

That's what I thought, so thank you for answering my concerns! The issue is solved then 👍

> So where does the 10x with 60W come from?

Ryzen 7 5800X idles at around 21W, so it's not 60W, it's 60-21=39W. I can force it down a lot by using power saving mode, but for some reason the CPU prefers to burn watts (it keeps the cores running a lot faster than is necessary for decoding).

> Is that what you get when you use FFmpeg or mpv patched with VVdeC?

Yeah, the same software setup (Fedora 39 + ffmpeg 6.0.1 patched with vvdec support + mpv), just different HW.

> Try setting the number of threads to 2 or 4 and the power consumption will probably go down.

No idea how it can be done when using mpv with patched ffmpeg. You cannot play videos using vvdecapp you know ;-)

> No idea how it can be done when using mpv with patched ffmpeg. You cannot play videos using vvdecapp you know ;-)

mpv --vd-lavc-threads=4

Btw, your CPU supports both HW AV1 and HW HEVC decode. I would bet a lot on FFmpeg using the HW capability for decoding, when available. So the 5W you are measuring might be HW, totally incomparable.

https://www.amd.com/en/product/13041

> mpv --vd-lavc-threads=4

Thanks!

Now it fluctuates between 13 and 14 watts on 7840HS, so we are down by around 2W (was 16.5W).

> Btw, your CPU supports both HW AV1 and HW HEVC decode. I would bet a lot on FFmpeg using the HW capability for decoding, when available. So the 5W you are measuring might be HW, totally incomparable.

I specifically tested software decoding by using --no-config, as the -v output confirms:

[vd] Container reported FPS: 0.000000
[vd] Codec list:
[vd]     libdav1d (av1) - dav1d AV1 decoder by VideoLAN
[vd]     av1 - Alliance for Open Media AV1
[vd] Opening decoder libdav1d
[vd] No hardware decoding requested.
[vd] Using software decoding.
[vd] Detected 16 logical cores.
[vd] Requesting 16 threads for decoding.
[ffmpeg/video] libdav1d: libdav1d 1.2.1
[vd] Selected codec: libdav1d (dav1d AV1 decoder by VideoLAN)

Hardware accelerated video decoding in mpv under Linux is not enabled by default.

I just played around with turbostat. When running on AC power, the power consumption I am seeing on my laptop (Intel i9-8950HK) matches your observations:
I also don't see a significant power reduction when going to 2 or 4 threads, only when going single-threaded: 26W vs 40W (PkgWatt).

But when running on battery it gets really weird:
Power consumption is worst when running single-threaded (30W) and improves with more threads, down to around 10W.

So either there is something that I really don't understand about CPU power consumption, or there is something wrong with turbostat or the Linux CPU driver.

I'll investigate that.

The native VVC decoder in ffmpeg (revision 59686eaf33) shows much better results.

Power consumption for the same file playback is less than 8W which means it's at least twice as fast/efficient.

Lol, no. More efficient, maybe. Faster? No way. In a year, maybe...

image

This is true in my testing as well, but for some reason vvdec consumes a lot more power during playback.

ffmpeg 6.0.1 + vvdec:

ffmpeg -benchmark -i y4m.266 -t 60 -f null out.null
ffmpeg version 6.0.1 Copyright (c) 2000-2023 the FFmpeg developers
Input #0, h266, from 'y4m.266':
  Duration: N/A, bitrate: N/A
  Stream #0:0: Video: vvc (Main 10), yuv420p10le(tv), 1024x576, 30 fps, 30 tbr, 1200k tbn
Stream mapping:
  Stream #0:0 -> #0:0 (vvc (libvvdec) -> wrapped_avframe (native))
Press [q] to stop, [?] for help
Output #0, null, to 'out.null':
  Metadata:
    encoder         : Lavf60.3.100
  Stream #0:0: Video: wrapped_avframe, yuv420p10le(tv, progressive), 1024x576, q=2-31, 200 kb/s, 30 fps, 30 tbn
    Metadata:
      encoder         : Lavc60.3.100 wrapped_avframe
frame= 1774 fps=860 q=-0.0 Lsize=N/A time=00:00:59.96 bitrate=N/A speed=29.1x
video:832kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: unknown
bench: utime=29.958s stime=1.238s rtime=2.063s
bench: maxrss=215208kB

ffmpeg's native VVC decoder:

ffmpeg -benchmark -i y4m.266 -t 60 -f null out.null
ffmpeg version N-113193-g59686eaf33 Copyright (c) 2000-2024 the FFmpeg developers
Input #0, vvc, from 'y4m.266':
  Duration: N/A, bitrate: N/A
  Stream #0:0: Video: vvc (Main 10), yuv420p10le(tv), 1024x576, 25 fps, 30 tbr, 1200k tbn
Stream mapping:
  Stream #0:0 -> #0:0 (vvc (native) -> wrapped_avframe (native))
Press [q] to stop, [?] for help
Output #0, null, to 'out.null':
  Metadata:
    encoder         : Lavf60.20.100
  Stream #0:0: Video: wrapped_avframe, yuv420p10le(tv, progressive), 1024x576, q=2-31, 200 kb/s, 30 fps, 30 tbn
      Metadata:
        encoder         : Lavc60.37.100 wrapped_avframe
[out#0/null @ 0x1705fc0] video:844kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: unknown
frame= 1800 fps=349 q=-0.0 Lsize=N/A time=00:01:00.00 bitrate=N/A speed=11.6x    
bench: utime=65.737s stime=0.473s rtime=5.155s
bench: maxrss=291176kB

Benchmark-wise vvdec is about 2.5 times faster (860 vs 349 fps). I've no idea how that's even possible.
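
For reference, the ratios can be computed from the two `ffmpeg -benchmark` outputs. A quick sketch with the numbers copied from the logs above; `ratio` is just an ad hoc helper.

```shell
#!/usr/bin/env bash
# Numbers copied from the two `ffmpeg -benchmark` runs above.
ratio() { awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f", a / b }'; }

fps_ratio=$(ratio 860 349)        # decoded frames per second, vvdec / native
wall_ratio=$(ratio 5.155 2.063)   # rtime (wall clock), native / vvdec
cpu_ratio=$(ratio 65.737 29.958)  # utime (user CPU time), native / vvdec

echo "throughput: vvdec reaches ${fps_ratio}x the fps of the native decoder"
echo "wall clock: the native decoder takes ${wall_ratio}x as long"
echo "user CPU:   the native decoder uses ${cpu_ratio}x as much"
```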

Let's compare perf stat.

ffmpeg + vvdec:

 Performance counter stats for 'ffmpeg -benchmark -i y4m.266 -t 60 -f null out.null':

         33,126.15 msec task-clock:u                     #   14.914 CPUs utilized             
                 0      context-switches:u               #    0.000 /sec                      
                 0      cpu-migrations:u                 #    0.000 /sec                      
            54,241      page-faults:u                    #    1.637 K/sec                     
   117,651,325,924      cycles:u                         #    3.552 GHz                         (83.34%)
       547,702,449      stalled-cycles-frontend:u        #    0.47% frontend cycles idle        (83.35%)
     8,220,320,729      stalled-cycles-backend:u         #    6.99% backend cycles idle         (83.33%)
    95,667,735,635      instructions:u                   #    0.81  insn per cycle            
                                                  #    0.09  stalled cycles per insn     (83.33%)
    10,436,031,665      branches:u                       #  315.039 M/sec                       (83.31%)
       274,235,727      branch-misses:u                  #    2.63% of all branches             (83.36%)

       2.221112317 seconds time elapsed

      30.810787000 seconds user
       1.734843000 seconds sys

ffmpeg's native VVC decoder:

 Performance counter stats for '/tmp/fftest/usr/local/bin/ffmpeg -benchmark -i y4m.266 -t 60 -f null out.null':

         68,640.89 msec task-clock:u                     #   12.771 CPUs utilized             
                 0      context-switches:u               #    0.000 /sec                      
                 0      cpu-migrations:u                 #    0.000 /sec                      
            72,286      page-faults:u                    #    1.053 K/sec                     
   234,741,877,414      cycles:u                         #    3.420 GHz                         (83.35%)
       322,797,165      stalled-cycles-frontend:u        #    0.14% frontend cycles idle        (83.31%)
     5,666,976,324      stalled-cycles-backend:u         #    2.41% backend cycles idle         (83.33%)
   410,165,403,950      instructions:u                   #    1.75  insn per cycle            
                                                  #    0.01  stalled cycles per insn     (83.29%)
    26,975,304,583      branches:u                       #  392.992 M/sec                       (83.36%)
       618,862,414      branch-misses:u                  #    2.29% of all branches             (83.35%)

       5.374933726 seconds time elapsed

      66.765453000 seconds user
       0.955226000 seconds sys

The native decoder executes over four times as many instructions.

ffplay (and mpv), which I'm using to measure power consumption, is a different application though.

OK, here we have something!

ffplay + native VVC decoder:

( sleep 60; killall ffplay ) & perf stat /tmp/fftest/usr/local/bin/ffplay y4m.266 2>&1 | grep -v "ffplay_buffer"
...
 Performance counter stats for '/tmp/fftest/usr/local/bin/ffplay y4m.266':

         61,989.84 msec task-clock:u                     #    1.033 CPUs utilized             
                 0      context-switches:u               #    0.000 /sec                      
                 0      cpu-migrations:u                 #    0.000 /sec                      
            75,900      page-faults:u                    #    1.224 K/sec                     
   209,194,185,801      cycles:u                         #    3.375 GHz                         (83.22%)
     1,022,848,182      stalled-cycles-frontend:u        #    0.49% frontend cycles idle        (83.42%)
     4,324,794,445      stalled-cycles-backend:u         #    2.07% backend cycles idle         (83.30%)
   423,105,014,844      instructions:u                   #    2.02  insn per cycle            
                                                  #    0.01  stalled cycles per insn     (83.31%)
    27,197,064,222      branches:u                       #  438.734 M/sec                       (83.39%)
       685,917,622      branch-misses:u                  #    2.52% of all branches             (83.37%)

      60.013007171 seconds time elapsed

      54.960438000 seconds user
       6.419155000 seconds sys

ffplay + vvdec:

( sleep 60; killall ffplay ) & perf stat ffplay y4m.266 2>&1 | grep -v "ffplay_buffer"
...
 Performance counter stats for 'ffplay y4m.266':

        127,678.66 msec task-clock:u                     #    2.128 CPUs utilized             
                 0      context-switches:u               #    0.000 /sec                      
                 0      cpu-migrations:u                 #    0.000 /sec                      
            58,185      page-faults:u                    #  455.714 /sec                      
   265,018,779,850      cycles:u                         #    2.076 GHz                         (83.38%)
     1,820,874,830      stalled-cycles-frontend:u        #    0.69% frontend cycles idle        (83.34%)
    19,669,705,117      stalled-cycles-backend:u         #    7.42% backend cycles idle         (83.26%)
   554,288,394,645      instructions:u                   #    2.09  insn per cycle            
                                                  #    0.04  stalled cycles per insn     (83.32%)
   140,273,527,951      branches:u                       #    1.099 G/sec                       (83.34%)
     1,098,984,578      branch-misses:u                  #    0.78% of all branches             (83.36%)

      60.008088507 seconds time elapsed

      77.153125000 seconds user
      49.218201000 seconds sys

Playback with vvdec is a lot more power intensive for some reason. Don't ask me why. I'm not a low level programmer.

5 times more branches:u. 1.6 times more branch-misses:u. task-clock:u twice as high.
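
The same comparison as a small script, so that the ratios can be re-derived from the two ffplay `perf stat` outputs above. It is only a sketch: the values are pasted in by hand and the row labels are shortened for the table.

```shell
#!/usr/bin/env bash
# Counter values copied from the two ffplay perf stat runs above.
# Columns: counter, ffplay + vvdec, ffplay + native decoder.
perf_ratios() {
    awk '{ printf "%-16s %.2fx\n", $1, $2 / $3 }' <<'EOF'
task-clock-msec 127678.66 61989.84
instructions    554288394645 423105014844
branches        140273527951 27197064222
branch-misses   1098984578 685917622
sys-seconds     49.218201 6.419155
EOF
}

perf_ratios
```

Each row prints how many times higher the vvdec value is than the native decoder's. The `sys-seconds` row is included because system time differs far more between the two runs than user time does.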

Something breaks when actually playing the file.

I've trimmed the video down to exactly 1 minute so that you could test yourself:

1minute-test-github-issue158.zip

It could be that the patch to integrate VVC decoding using vvdec in ffmpeg is not perfect.

K-os commented

I worked a little on the thread pool implementation: it no longer keeps one core busy when decoding at a fixed frame rate. This should fix the issue with the power consumption, because the whole CPU can sleep in between frames.

Here is an example playing 30s of your sample video.

Before:

> turbostat --quiet --show Avg_MHz,Busy%,Bzy_MHz,PkgWatt --Summary sh -c './ffplay blast.mp4 -an -t 30 -v 0 -autoexit >/dev/null'
29.296818 sec
Avg_MHz	Busy%	Bzy_MHz	PkgWatt
933	23.66	3945	43.06

With PR #162:

./turbostat --quiet --show Avg_MHz,Busy%,Bzy_MHz,PkgWatt --Summary sh -c './ffplay blast.mp4 -an -t 30 -v 0 -autoexit >/dev/null'
29.311245 sec
Avg_MHz	Busy%	Bzy_MHz	PkgWatt
516	26.02	1981	14.33

Thanks for pointing the issue out to us. The threading implementation was developed with maximum throughput in mind, before there was an integration into FFmpeg or media players. So, looking at it with turbostat was quite helpful.

That's amazing news @K-os , thanks a ton! I will test this patch once it gets merged.

I've rebuilt the library and power consumption has indeed gone down, though not as much as you observed.

vvdec git snapshot:

20.223084 sec
Avg_MHz	Busy%	Bzy_MHz	PkgWatt
364	9.95	3660	9.75

vvdec 2.1.3:

20.218901 sec
Avg_MHz	Busy%	Bzy_MHz	PkgWatt
589	14.66	4017	15.62
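
To put both measurements on the same scale, here is the relative PkgWatt reduction for each pair of runs. A sketch with the values copied from the turbostat summaries in this thread; `reduction` is an ad hoc helper, and the two pairs come from different machines and clip lengths, so only the percentages are comparable.

```shell
#!/usr/bin/env bash
# PkgWatt values copied from the turbostat summaries in this thread.
# Prints the relative reduction in percent: (before - after) / before * 100.
reduction() {
    awk -v before="$1" -v after="$2" \
        'BEGIN { printf "%.1f", (before - after) / before * 100 }'
}

pr_author=$(reduction 43.06 14.33)  # before vs with PR #162
reporter=$(reduction 15.62 9.75)    # vvdec 2.1.3 vs git snapshot

echo "PR author's run: ${pr_author}% less package power"
echo "reporter's run:  ${reporter}% less package power"
```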

Still it's a major improvement, thanks!