`zune-jpeg` 2x slower than `image` crate on my M1 MacBook
Closed this issue · 4 comments
I am really intrigued by `zune-image` based on its compilation time and boasted speed, but my first impression is that it is much slower than the `image` crate.
I am decoding 1440 × 1920 RGB JPEGs. `image` takes around 17 ms to decode one, while `zune-jpeg` takes around 33 ms. I am NOT using the `jpeg_rayon` feature of `image`.
Here is one of the images that decodes slowly:
(the image comes from saving frames of videos from https://github.com/google-research-datasets/Objectron)
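For anyone who wants to reproduce this kind of per-image timing, here is a minimal sketch of a harness that uses only the standard library. The function name `mean_ms_per_call` and the closure parameter are assumptions made for illustration; the actual decoder call (from `image` or `zune-jpeg`) would go inside the closure and is deliberately not shown here.

```rust
use std::hint::black_box;
use std::time::Instant;

/// Runs `decode` `warmup` times untimed, then `iters` times timed,
/// and returns the mean wall-clock time per call in milliseconds.
/// `decode` is a placeholder for whatever decoder call is being measured.
fn mean_ms_per_call<T>(warmup: u32, iters: u32, mut decode: impl FnMut() -> T) -> f64 {
    assert!(iters > 0, "need at least one timed iteration");

    // Warm-up runs let caches and CPU frequency settle before timing starts.
    for _ in 0..warmup {
        black_box(decode());
    }

    let start = Instant::now();
    for _ in 0..iters {
        // black_box keeps the optimizer from discarding the decoded result.
        black_box(decode());
    }
    start.elapsed().as_secs_f64() * 1000.0 / f64::from(iters)
}
```

Numbers like these are only meaningful in a `--release` build, and it is probably best to read the JPEG bytes into memory once outside the closure so that file I/O is not part of the measurement.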
Hi, it's because our IDCT is not vectorized, while the JPEG decoder used by `image` contains a vectorized one. There was some effort to address that, and there is a branch that achieves it. I was going to merge it manually, but time became an issue, and I don't have the hardware to test it.
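For context, the IDCT (inverse discrete cosine transform) is the step that turns each 8×8 block of frequency coefficients back into pixel values. The sketch below is a naive floating-point version written only to show the shape of the computation. It is not the code from `zune-jpeg` or any other decoder; real decoders typically use fixed-point integer factorizations, and the function names here are made up for illustration.

```rust
use std::f32::consts::{FRAC_1_SQRT_2, PI};

/// Naive 8-point 1-D inverse DCT with JPEG's scaling (illustrative only).
fn idct_1d(input: &[f32; 8]) -> [f32; 8] {
    let mut out = [0.0f32; 8];
    for x in 0..8 {
        let mut sum = 0.0f32;
        for u in 0..8 {
            // The DC term (u == 0) carries an extra 1/sqrt(2) factor.
            let cu = if u == 0 { FRAC_1_SQRT_2 } else { 1.0 };
            let angle = (2 * x + 1) as f32 * u as f32 * PI / 16.0;
            sum += cu * input[u] * angle.cos();
        }
        out[x] = 0.5 * sum;
    }
    out
}

/// 2-D IDCT of one 8x8 block, done as a row pass followed by a column pass.
fn idct_8x8(block: &[[f32; 8]; 8]) -> [[f32; 8]; 8] {
    // Row pass: each row is transformed independently.
    let mut tmp = [[0.0f32; 8]; 8];
    for r in 0..8 {
        tmp[r] = idct_1d(&block[r]);
    }

    // Column pass: each column is transformed independently.
    let mut out = [[0.0f32; 8]; 8];
    for c in 0..8 {
        let mut col = [0.0f32; 8];
        for r in 0..8 {
            col[r] = tmp[r][c];
        }
        let res = idct_1d(&col);
        for r in 0..8 {
            out[r][c] = res[r];
        }
    }
    out
}
```

The eight rows (and then the eight columns) go through exactly the same arithmetic with no dependency on each other, which is why this step maps well onto SIMD lanes and why a scalar version can fall noticeably behind a vectorized one.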
Merged the NEON-optimized IDCT to the `dev` branch, which should improve decode times considerably on Arm NEON chips.
I can't really test by how much, but I would love it if you could report speeds on your side.
I'm sorry, my original numbers were bad somehow. I've now done better measurements and found this:
- `image`: 15.6 ms per JPEG
- `zune-jpeg 0.3.17`: 10.3 ms per JPEG
- `zune-jpeg` `dev` branch: 8.2 ms per JPEG
I also measured in Firefox when compiling to WASM:
- `image`: 35 ms per JPEG
- `zune-jpeg 0.3.17`: 29 ms per JPEG
- `zune-jpeg` `dev` branch: 29 ms per JPEG
So: seems like I'm switching to zune-jpeg 🤘😃
good to hear that
WASM speeds can be improved by using WASM intrinsics. If I happen to have time I may implement it, but no promises there :)