etemesi254/zune-image

panic on decoding a malformed JPEG: "Input length is not half the size of the output length"

Closed this issue · 11 comments

The file to reproduce the issue: crash-9fd32fc56fdc9348a4f68a788c5cb3e96186994d

Found by the fuzzer decode_buffer with the in-repo fuzz corpus.

Backtrace:

thread '<unnamed>' panicked at /home/shnatsel/Code/zune-image/crates/zune-jpeg/src/upsampler/scalar.rs:12:5:
assertion `left == right` failed: Input length is not half the size of the output length
  left: 32
 right: 64
stack backtrace:
   0: rust_begin_unwind
             at /rustc/6b771f6b5a6c8b03b6322a9c77ac77cb346148f0/library/std/src/panicking.rs:645:5
   1: core::panicking::panic_fmt
             at /rustc/6b771f6b5a6c8b03b6322a9c77ac77cb346148f0/library/core/src/panicking.rs:72:14
   2: core::panicking::assert_failed_inner
   3: core::panicking::assert_failed
   4: zune_jpeg::upsampler::scalar::upsample_horizontal
   5: zune_jpeg::upsampler::scalar::upsample_hv
   6: zune_jpeg::worker::upsample_and_color_convert_v
   7: zune_jpeg::mcu::<impl zune_jpeg::decoder::JpegDecoder<T>>::post_process
   8: zune_jpeg::mcu::<impl zune_jpeg::decoder::JpegDecoder<T>>::decode_mcu_ycbcr_baseline
   9: zune_jpeg::decoder::JpegDecoder<T>::decode_into
  10: zune_jpeg::decoder::JpegDecoder<T>::decode
  11: decode_buffer::_::__libfuzzer_sys_run
  12: rust_fuzzer_test_input
  13: std::panicking::try::do_call
  14: __rust_try
  15: LLVMFuzzerTestOneInput
  16: _ZN6fuzzer6Fuzzer15ExecuteCallbackEPKhm
  17: _ZN6fuzzer10RunOneTestEPNS_6FuzzerEPKcm
  18: _ZN6fuzzer12FuzzerDriverEPiPPPcPFiPKhmE
  19: main
  20: <unknown>
  21: __libc_start_main
  22: _start

On it

Some analysis

The attached image has the following components and sampling factors (items looking like 2hx2v mean 2 samples on horizontal axis, 2 samples on first component (Y component) on vertical axis for every 1 sample on horizontal axis and one sample on vertical axis for second component for... ), see https://en.wikipedia.org/wiki/Chroma_subsampling

Component 1: 2hx2v q=0
Component 2: 1hx1v q=1
Component 3: 2hx1v q=1

This roughly translates to upsample using the hv sampling method (4:2:0) on Component 2 and use the H sampling
method (4:2:2) on Component 3, this is usually non-standard but libjpeg-turbo works for it, so I at least have to try

The crash

The library (incorrectly) assumed that the usual sub-sampled components (Component 2 and 3) would always have the same sampling factor
hence it gets shocked when Component 3 has 2x the sample it needs

The solution

The way to solve this is to have each component upsample itself based on it's horizontal and vertical sampling factor.
We do not assume that the components have same sampling factor. This requires some heavy rewrites to fix this, hence timeline is a bit fuzzy

Consequences

This will cause an increase in memory usage of decoding all subsampled images to increase, which I believe is worthy as it
conforms with the standard practice

I'm not convinced a proper fix is worth the trouble, actually. This is a fuzzer-generated image; I do not recall seeing this failure mode on real-world images.

Please give me a few days to see if this can also occur in the real world - I think I still have my JPEG datasets left over from previous tests. If such images turn out to be both non-standard and exceedingly rare, increasing the memory usage for everyone else might not be worth it.

Okay, I found a real-world image that triggers a panic with the same message: slider125

It was in my corpus of random JPEGs scraped off the web based on a subset of CommonCrawl. It's only one file out of 55,000 but it's there.

1 in 50000 is a low probability , but maybe will branch and see how much of an effort it is to support such images

If it's too difficult I will just reject such images

Should be fixed now, would be nice if you ran it on your corpus checking for extreme decoding differences before i push a fix to crates io

Sure, I'll do it during the weekend. Just the before/after comparison, right?

Here are the divergences and decode failures after this PR:

zune-jpeg-scraped-post-issue-148-divergence.tar.gz

Decode failures archive is too big for Github: https://mega.nz/file/hhcHgCQR#R0oRQAYJHnaNvhFNs84ta0tQ6CqxsUj_1sxEcG7HfBk
Most of the files that fail to decode are just damaged, but there are some legitimate failures in there too.

For the decode failures, most aren't even images, and the ones that crash are legitimately corrupt ones.

For the divergence there are legitimate ones, but the image themselves aren't that perfect, e.g they end prematurely (ireland-carry), I will investigate and fix them but first I'd like to push one to crates.io with the fix that was for this since I don't see any divergences caused by this

Sounds good to me! Thanks!