ease comparison with other formats

Question

tudorsuciu opened this issue 6 years ago · 4 comments

Hello, butteraugli is incredibly sensitive, and I had a lot of issues comparing pik to other formats.
A simple ffmpeg -i image.jpg image.yuv, which is needed to compare with other formats, will completely ruin the butteraugli score because of the rgb24 -> yuv420 conversion. By default the yuv output has limited range (luma goes from 16 to 235), which adds even more loss.
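
To make the limited-range point concrete, here is a minimal sketch of the luma round trip alone. It assumes the usual 8-bit mapping of full range 0..255 onto 16..235 with simple rounding; the function names are made up for illustration and this is not what ffmpeg does internally. Because 256 input values are squeezed into 220 codes, some values cannot come back unchanged.

```python
def to_limited(v: int) -> int:
    """Map a full-range 8-bit luma value (0..255) to limited range (16..235)."""
    return round(16 + 219 * v / 255)


def to_full(c: int) -> int:
    """Map a limited-range luma code (16..235) back to full range (0..255)."""
    return round((c - 16) * 255 / 219)


# Every full-range value goes through full -> limited -> full.
codes = {to_limited(v) for v in range(256)}
errors = [abs(to_full(to_limited(v)) - v) for v in range(256)]

distinct_codes = len(codes)                       # how many limited-range codes are used
lossy_values = sum(1 for e in errors if e > 0)    # inputs that do not survive the round trip
max_error = max(errors)                           # worst-case error in 8-bit steps

print(distinct_codes, lossy_values, max_error)
```

The per-value error is small, but it is added on top of whatever the chroma subsampling and the codec itself already lose.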

In the following script I propose a way to compare pik fairly to h264/h265/aom/jpg. Pik seems to win without much trouble if the goal is minimal change compared to the original.
benchmark.txt

Is there any way to use yuv420p as input to pik?
Do you have any better settings for h264/hevc/av1?
(jpeg quality 93 has a similar butteraugli score to pik at almost double the size)

Answer 1 · 2019-02-22T08:43:13.000Z

Hi, and thank you for your interest in PIK :)

A simple ffmpeg -i image.jpg image.yuv, which is needed to compare with other formats, will completely ruin the butteraugli score because of the rgb24 -> yuv420 conversion.

That is to be expected, as 4:2:0 chroma subsampling is indeed a lossy processing step. https://en.wikipedia.org/wiki/Chroma_subsampling#4:2:0 and https://upload.wikimedia.org/wikipedia/commons/0/06/Colorcomp.jpg are examples of the kind of loss that one can expect.
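
As a rough illustration of that loss, the sketch below runs a tiny image through RGB -> YCbCr -> 4:2:0 -> RGB in floating point, so the only loss is the chroma subsampling itself. It assumes full-range BT.601 (JFIF) coefficients, 2x2 averaging for downsampling and nearest-neighbour upsampling; real converters use other filters, and all names here are illustrative.

```python
def rgb_to_ycbcr(r, g, b):
    """Full-range BT.601 (JFIF) forward transform, kept in floating point."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


def ycbcr_to_rgb(y, cb, cr):
    """Inverse of rgb_to_ycbcr (no rounding or clamping)."""
    r = y + 1.402 * (cr - 128)
    g = y - 0.344136 * (cb - 128) - 0.714136 * (cr - 128)
    b = y + 1.772 * (cb - 128)
    return r, g, b


def roundtrip_420(img):
    """img: rows of (r, g, b) tuples with even width and height."""
    h, w = len(img), len(img[0])
    ycc = [[rgb_to_ycbcr(*px) for px in row] for row in img]
    out = [[None] * w for _ in range(h)]
    for by in range(0, h, 2):
        for bx in range(0, w, 2):
            block = [(by + dy, bx + dx) for dy in (0, 1) for dx in (0, 1)]
            # One chroma sample per 2x2 block: average Cb and Cr, keep luma per pixel.
            cb = sum(ycc[y][x][1] for y, x in block) / 4
            cr = sum(ycc[y][x][2] for y, x in block) / 4
            for y, x in block:
                out[y][x] = ycbcr_to_rgb(ycc[y][x][0], cb, cr)
    return out


def max_error(a, b):
    """Largest per-channel absolute difference between two images."""
    return max(abs(ca - cb_)
               for ra, rb in zip(a, b)
               for pa, pb in zip(ra, rb)
               for ca, cb_ in zip(pa, pb))


RED, BLUE = (255, 0, 0), (0, 0, 255)
flat = [[(200, 120, 40)] * 4 for _ in range(4)]
checker = [[RED if (x + y) % 2 == 0 else BLUE for x in range(4)] for y in range(4)]

flat_error = max_error(flat, roundtrip_420(flat))
checker_error = max_error(checker, roundtrip_420(checker))
print(flat_error, checker_error)
```

A flat colour survives almost perfectly, while a one-pixel red/blue checkerboard is badly distorted, which is the kind of fine colour detail the linked comparison image shows.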

Is there any way to use yuv420p as input to pik?

That is not recommended if one has access to the full original input. A fair comparison would start from the original for all codecs (if they need to subsample it, that’s “their problem”, so to speak), and would compare the decoded image of each codec with that same original image (be it with Butteraugli, by visual inspection…).
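
The comparison described above can be sketched as a small harness: every codec receives the same original, and every decoded result is scored against that same original. The two toy "codecs" and the max-absolute-difference metric are placeholders invented for this sketch (they stand in for real encoders and for Butteraugli), and `evaluate` is a hypothetical helper, not part of any tool mentioned in this thread.

```python
from typing import Callable, Dict, List

Image = List[List[int]]  # toy grayscale image: rows of 8-bit samples


def max_abs_diff(a: Image, b: Image) -> int:
    """Placeholder metric; a real benchmark would call Butteraugli here."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def evaluate(original: Image,
             codecs: Dict[str, Callable[[Image], Image]],
             metric: Callable[[Image, Image], int] = max_abs_diff) -> Dict[str, int]:
    """Run each encode+decode round trip on the SAME original and score it against it."""
    return {name: metric(original, roundtrip(original))
            for name, roundtrip in codecs.items()}


def lossless(img: Image) -> Image:
    """Stand-in for a lossless codec: returns an exact copy."""
    return [row[:] for row in img]


def coarse(img: Image) -> Image:
    """Stand-in for a lossy codec: drops the low 4 bits of every sample."""
    return [[v & 0xF0 for v in row] for row in img]


original = [[0, 15, 128, 255]]
scores = evaluate(original, {"lossless": lossless, "coarse": coarse})
print(scores)
```

Any subsampling or colour conversion a codec needs happens inside its own round trip, so its cost shows up in that codec's score rather than in a degraded shared input.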

Do you have any better settings for h264/hevc/av1?

I believe that at least x264 and x265 support -pix_fmt yuvj444p. Does that give better results than yuvj420p? Chroma subsampling is one way to save space, but it’s not necessarily the best, and it’s possible that x264 and x265 may come up with better techniques.

Also, I noticed that the script produces a JPEG file out of the resizing+cropping. Is that intended?

Answer 2 · 2019-02-22T20:38:38.000Z

If you consider that the original is the yuv420p file, the conversion to rgb24 will double the data size.
It will also embed the specific algorithm used for upscaling (it could be considered noise), and pik will (does it?) strive to preserve this info.
There are a lot of real-world cases of yuv422p input data.
If you consider that the complexity of supporting yuv420/yuv422/yuv444/rgb in pik is not worth the effort, I can understand that. At least in theory, a yuv420p pik image could be 2x smaller.
All the other codecs support yuv444; it's just that they are not optimized for it, and files that are already 2x the size of pik will grow even larger.
My script tries to reproduce this case:
I have a jpeg/heif; what is the smallest space in which I can store the images with minimum quality loss?
All the new codecs deserve a benchmark! Pik is already very interesting for jpeg archival.

Answer 3 · 2019-02-25T12:47:57.000Z

On Thu, Feb 21, 2019 at 6:14 PM tudorsuciu wrote:

Hello, butteraugli is incredibly sensitive. I had a lot of issues comparing to other formats.

Butteraugli is a lot more sensitive for lines (displaced, emerging or removed) than any other visual measure. Also, butteraugli has been tuned for a viewing distance of 1000 pixels. Often practical viewing distance is 2000 pixels. The non-linearities near the just-noticeable-boundary in scale are enormous and can be difficult to reason about. Also, to be able to appreciate butteraugli 1.0 level error correctness you need to have a high quality monitor with a large gamma correction LUT. At larger errors (4.0+) butteraugli becomes less useful. Current versions of butteraugli only extrapolate these values as multiples of just-noticeable-difference, but the human visual system is highly non-linear and large extrapolation doesn't bring much value.

A simple ffmpeg -i image.jpg image.yuv, which is needed to compare with other formats, will completely ruin the butteraugli score because of the rgb24 -> yuv420 conversion. By default the yuv output has limited range (luma goes from 16 to 235), which adds even more loss.

Some people developing video codecs have noted that ffmpeg's color conversions are not high quality and cannot be used for high-quality photography compression. We wrote our own yuv-rgb-yuv conversion and got better results (IIRC, a 0.2-0.3 butteraugli score difference).

I propose in the following script a way to compare fairly pik to h264/h265/aom/jpg. It seems to win without much trouble if the wanted result is minimal change compared to original. benchmark.txt <https://github.com/google/pik/files/2892114/benchmark.txt>

That matches my observations. Video codecs create good-looking results, but do not care so much for authenticity. PIK cares a lot about authenticity and can deliver it with far fewer bits (possibly 2-3x fewer) than modern video codecs.

Is there any way to use yuv420p as input to pik?

No special way, other than just through RGB. This creates a 5-15 % penalty for PIK. I consider yuv420 something that should not be done for images at any stage. In particular, demosaicking should not create images in yuv420 or yuv422. YUV420 and YUV422 seem mathematically and physiologically less wrong when one uses them with ICtCp. I have no experience of whether PIK's XYB colorspace works with less penalty with ICtCp YUV420, but it is very likely. I'd anticipate the penalty to be roughly halved compared to that.

Do you have any better settings for h264/hevc/av1?

Khavish on encode.ru might be able to recommend something. He did extensive studies between codecs.

Answer 4 · 2019-02-26T10:16:41.000Z

Some people developing video codecs have noted that ffmpeg's color conversions are not high quality and cannot be used for high-quality photography compression. We wrote our own yuv-rgb-yuv conversion and got better results (IIRC, a 0.2-0.3 butteraugli score difference).

It was probably true in the past, but now ffmpeg has a lot of plugins that do high quality colorspace conversion. They are usually very badly documented. I needed to read the source in order to find the magic command line for full range -> limited range conversion (with the colorspace video plugin).
Swscale also has plenty of high quality chroma transformation options that are hard to find/quantify.
Could you point me to the sources of yuv->rgb->yuv with 0.2 butteraugli? I already have that for rgb->yuvj444->rgb. Do you have it for rgb->yuv420->rgb?
With my custom code, which I find too hackish to share in its current shape, I got rgb->yuv420->rgb to 0.8~0.9 butteraugli. But I'm not happy yet with the coefficients for the transformations; there might be some off values that lose quality.
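
For reference, here is a minimal sketch of the rgb->yuvj444->rgb case with 8-bit storage, measuring plain per-channel error rather than butteraugli. It assumes the full-range BT.601 (JFIF) coefficients and round-to-nearest with clamping; these are not the coefficients of the custom code mentioned above, which was not shared, and the helper names are illustrative.

```python
def clamp8(x: float) -> int:
    """Round to the nearest integer and clamp to the 8-bit range."""
    return max(0, min(255, round(x)))


def rgb_to_ycbcr8(r, g, b):
    """Full-range BT.601 (JFIF) transform with 8-bit quantization."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return clamp8(y), clamp8(cb), clamp8(cr)


def ycbcr8_to_rgb(y, cb, cr):
    """Inverse transform back to 8-bit RGB."""
    r = y + 1.402 * (cr - 128)
    g = y - 0.344136 * (cb - 128) - 0.714136 * (cr - 128)
    b = y + 1.772 * (cb - 128)
    return clamp8(r), clamp8(g), clamp8(b)


# Sample the RGB cube on a coarse grid (0, 17, ..., 255 per channel).
samples = [(r, g, b)
           for r in range(0, 256, 17)
           for g in range(0, 256, 17)
           for b in range(0, 256, 17)]

errors = [max(abs(a - c) for a, c in zip(px, ycbcr8_to_rgb(*rgb_to_ycbcr8(*px))))
          for px in samples]

max_err = max(errors)                 # worst per-channel error over the samples
lossy = sum(1 for e in errors if e)   # samples that do not round-trip exactly
print(max_err, lossy, len(samples))
```

Even without any subsampling, rounding to 8 bits in YCbCr prevents some RGB values from round-tripping exactly, so part of a measured score may come from quantization rather than from off coefficients.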