cloudinary/ssimulacra2

Support input formats other than PNG?

Hello dear devs,

I was evaluating SSIMULACRA2 in the context of video quality assessment. I observed a large overhead because the ssimulacra2 CLI tool does not support any input format besides PNG. For instance, to calculate per-frame metrics, the workflow has to extract all frames from the source and distorted videos, but extracting frames to PNG is a relatively slow process (compared to, say, TIFF or BMP):

$ time ffmpeg -i raw5s.y4m frames/f%d.png
...
real    0m5.224s
user    0m39.461s
sys     0m0.556s

$ time ffmpeg -i raw5s.y4m frames/f%d.tiff
...
real    0m0.387s
user    0m2.028s
sys     0m0.405s

Do you have any plans to support other than PNG input, perhaps even video input?

PPM input (as well as other uncompressed formats like PGM, PAM, PFM, PGX) is supported, so you could use that.
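
As a rough sketch of what that per-frame workflow could look like with PPM: the `score_frames` helper and the `ref/` and `dist/` directory names below are made up for illustration, and it assumes `ssimulacra2 original distorted` prints the score on stdout.

```shell
#!/bin/sh
# Assumed frame extraction beforehand (directory and file names are placeholders):
#   ffmpeg -i raw5s.y4m ref/f%d.ppm
#   ffmpeg -i distorted.mkv dist/f%d.ppm

# score_frames REF_DIR DIST_DIR
# Hypothetical helper: prints "<frame name> <score>" for every frame that
# exists in both directories, and warns on stderr about missing frames.
score_frames() {
    ref_dir=$1
    dist_dir=$2
    for ref in "$ref_dir"/*.ppm; do
        # Skip the unexpanded glob when the directory has no .ppm files.
        [ -e "$ref" ] || continue
        name=$(basename "$ref")
        dist="$dist_dir/$name"
        if [ -f "$dist" ]; then
            printf '%s %s\n' "$name" "$(ssimulacra2 "$ref" "$dist")"
        else
            echo "missing distorted frame: $name" >&2
        fi
    done
}

# Example: score_frames ref dist > scores.txt
```

Note that the glob expands in lexical order (f1, f10, f2, ...), so sort the output by frame number if the order matters.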

PNG encoding can be very slow but it can also be very fast (e.g. using https://github.com/veluca93/fpnge). Perhaps you could open an issue at ffmpeg to make their png encoder faster, or at least have an option to use faster settings?

SSIMULACRA2 supports the same input formats as cjxl. With the current versions of ffmpeg and libjxl, that means you could also use lossless JPEG XL as an intermediate format:

ffmpeg -i raw5s.y4m -effort 1 -distance 0 frames/f%d.jxl

This is probably not much slower than using uncompressed files.
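
If you want to check that on your own material, something like the following could compare extraction times across formats. This is only a sketch: `time_extract` is a hypothetical helper, it needs an ffmpeg build with libjxl for the `jxl` case, and `date +%s` only gives whole-second resolution, so use a longer clip than 5 seconds for meaningful numbers.

```shell
#!/bin/sh
# time_extract INPUT EXT...
# Hypothetical helper: extracts all frames of INPUT once per extension into a
# throwaway directory and prints the elapsed wall-clock time in whole seconds.
time_extract() {
    input=$1
    shift
    for ext in "$@"; do
        out_dir=$(mktemp -d) || return 1
        start=$(date +%s)
        case $ext in
            # Lossless JPEG XL at the fastest effort, as in the command above.
            jxl) ffmpeg -loglevel error -i "$input" -effort 1 -distance 0 "$out_dir/f%d.jxl" ;;
            *)   ffmpeg -loglevel error -i "$input" "$out_dir/f%d.$ext" ;;
        esac
        end=$(date +%s)
        printf '%s: %ss\n' "$ext" "$((end - start))"
        rm -rf "$out_dir"
    done
}

# Example: time_extract raw5s.y4m png ppm jxl
```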

Adding y4m as an input format to cjxl could also be an option; there's already an open issue about that: libjxl/libjxl#2663. Doing it in a way that allows lossless y4m encoding (i.e. staying in chroma-subsampled YCbCr) is a bit tricky in terms of code plumbing, though. y4m input that converts to RGB should be doable.

In general, for performance it's probably better to integrate ssimulacra2 directly into ffmpeg so that no intermediate files are needed. For VQA you'll probably also want to add some kind of temporal aspect (e.g. preprocessing that blurs both the original and the distorted frames, with a radius locally proportional to the amount of motion/change in each region), which you cannot do with a per-frame approach.

@jonsneyers Thank you for the detailed response! I will try JXL for frame compression.
I also hope that some day SSIMULACRA2 will be integrated into ffmpeg to simplify the workflow for users.