Possible speedup for benchmark
jcupitt opened this issue · 3 comments
Hello, this project looks very interesting.
I don't know Elixir, but I wonder if your small benchmark is hitting the libvips fast path?
https://github.com/kipcole9/image/blob/main/bench/vips_v_mogrify.exs
I'm not sure it's using sequential mode, and I don't think it's exploiting JPEG shrink-on-load (though I'm not certain, of course, sorry).
Here's a tiny benchmark in python to show the difference:
#!/usr/bin/python3
import sys
import time
import pyvips
print("random access, using thumbnail_image:")
start = time.time()
x = pyvips.Image.new_from_file(sys.argv[1])
x = x.thumbnail_image(256)
x.write_to_file(sys.argv[2])
end = time.time()
print(f"took {(end - start) * 1000:.2f}ms")
print("sequential access, using thumbnail:")
start = time.time()
x = pyvips.Image.thumbnail(sys.argv[1], 256)
x.write_to_file(sys.argv[2])
end = time.time()
print(f"took {(end - start) * 1000:.2f}ms")
I see:
$ ./bench.py ~/pics/nina.jpg x.jpg
random access, using thumbnail_image:
took 398.28ms
sequential access, using thumbnail:
took 39.30ms
Where nina.jpg is a 6k x 4k JPEG.
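As a rough illustration of why shrink-on-load matters for an image this size: libjpeg can decode at 1/2, 1/4 or 1/8 scale, so most of the reduction can happen during decoding, before any resizing runs. The sketch below is a simplified model, not the selection logic libvips actually uses; the function name is made up, and the 6000-pixel width is an assumed reading of "6k".

```python
def jpeg_shrink_factor(source_width: int, target_width: int) -> int:
    """Pick the largest libjpeg decode-time shrink (8, 4, 2 or 1) that
    still leaves at least target_width pixels for the final resize.

    Simplified model for illustration; the real libvips heuristics differ.
    """
    for factor in (8, 4, 2):
        if source_width // factor >= target_width:
            return factor
    return 1


# Assumed dimensions: "6k x 4k" taken as 6000 x 4000, target width 256.
factor = jpeg_shrink_factor(6000, 256)
decoded_width = 6000 // factor
print(f"shrink-on-load factor {factor}, decoded width {decoded_width}")
```

Under these assumptions the decoder only has to produce an image one eighth the width and height of the original, which is consistent with the lower time and memory reported for thumbnail.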
You'll see a large drop in memory use too. If I comment out the thumbnail version and just measure thumbnail_image, I see:
$ /usr/bin/time -f %M:%e ./bench.py ~/pics/nina.jpg x.jpg
random access, using thumbnail_image:
took 434.63ms
sequential access, using thumbnail:
took 0.00ms
235708:0.59
i.e. a peak of about 240 MB of RAM. If I comment out the thumbnail_image version and just time thumbnail, I see:
$ /usr/bin/time -f %M:%e ./bench.py ~/pics/nina.jpg x.jpg
random access, using thumbnail_image:
took 0.00ms
sequential access, using thumbnail:
took 35.46ms
54208:0.17
Peak memory use of 54 MB.
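The ratios implied by these figures can be worked out in a few lines of Python. This is only a sketch: the helper name is made up for illustration, and the inputs are the numbers reported above. With GNU time, %M is the maximum resident set size in kilobytes and %e is elapsed wall-clock seconds.

```python
def parse_time_output(line: str) -> tuple[int, float]:
    """Split GNU time's "%M:%e" output into (peak RSS in KB, elapsed seconds)."""
    peak_kb, elapsed_s = line.strip().split(":")
    return int(peak_kb), float(elapsed_s)


random_kb, _ = parse_time_output("235708:0.59")
sequential_kb, _ = parse_time_output("54208:0.17")

memory_ratio = random_kb / sequential_kb
speed_ratio = 398.28 / 39.30  # per-operation timings from the first run

print(f"peak memory: {memory_ratio:.1f}x lower with thumbnail")
print(f"wall time:   {speed_ratio:.1f}x faster with thumbnail")
```

So on this one image, thumbnail is roughly ten times faster and uses roughly a quarter of the peak memory.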
If you're curious, there's a chapter in the docs explaining how the access mode flag works:
https://www.libvips.org/API/current/How-it-opens-files.md.html
And a page on the wiki about how thumbnail works:
https://github.com/libvips/libvips/wiki/HOWTO----Image-shrinking
We ought to move that into the main docs and update it.
@jcupitt Thanks very much for checking in. Surprised my small little project got your attention!
I have read the docs you mentioned (and as much else as I can get my hands on). Those early benchmarks were written before I understood the difference in behaviour between thumbnail and thumbnail_image. My library implementation now makes it easy to use both: pass a file path and it uses thumbnail, pass an image and it uses thumbnail_image.
In my API, Image.open also defaults to access=sequential, since that has the benefits you've outlined so well. It feels like the sanest default for a library whose primary intent is streamed transformations. If you think that's a mistake, please let me know!
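For anyone following along in pyvips, opening in sequential mode looks roughly like the sketch below. new_from_file with access="sequential" and thumbnail_image are existing pyvips calls; the wrapper function name is made up for illustration, and pyvips is imported inside the function so the snippet can be loaded without it installed.

```python
def resize_opened_image(src: str, dst: str, width: int) -> None:
    """Open src in sequential mode, resize to width, and write to dst."""
    import pyvips  # imported here so defining the function needs no pyvips

    # Sequential access streams the file top to bottom instead of
    # decoding the whole image for random access.
    image = pyvips.Image.new_from_file(src, access="sequential")
    # thumbnail_image works on an already-opened image, so it cannot use
    # shrink-on-load; pyvips.Image.thumbnail(src, width) can.
    image = image.thumbnail_image(width)
    image.write_to_file(dst)
```

Calling resize_opened_image("in.jpg", "out.jpg", 256) with placeholder file names would exercise the sequential path, though without the shrink-on-load saving that thumbnail gets from a file path.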
libvips is an amazing piece of work, great craft. And its functional orientation, immutability and multi-threaded nature align really well with the Erlang BEAM VM upon which Elixir runs.
I'll revisit the benchmarks when I finish up some image streaming work and will update this issue with the results. Many thanks again.
--Kip
Hi Kip, sure, that all sounds good. So the lines:
{:ok, image} = Image.open(image_path)
{:ok, image} = Image.resize(image, 250)
out_path = Temp.path!(suffix: ".jpg", basedir: temp_dir)
:ok = Image.write(image, out_path)
Turn into thumbnail behind the scenes? That's very neat.
Yes, I started off in functional programming, so libvips is supposed to be a bit like Haskell (lazy, pure, memoization, etc.). I think you're the first person to say this!
The runtime on which this runs supports functions with the same name and arity but with different types of parameters. So in this case, Image.resize can be called with either an Image parameter or a string that is considered to be a pathname:
# We land here if `resize` is called with an image (%Vimage{} is the syntax
# for referring to a data structure).
def resize(%Vimage{} = image, width, options) when is_size(width) do
  with {:ok, options} <- Resize.validate_options(options) do
    Operation.thumbnail_image(image, width, options)
  end
end
# We land here if `resize` is called with a string. The `is_binary(image_path)`
# check is called a `guard clause`, and this function clause is selected only
# if the parameters meet the guard conditions. In this language a string is a
# subtype of the binary type.
def resize(image_path, width, options) when is_binary(image_path) and is_size(width) do
  # `<-` lets a failed match fall through as the return value instead of
  # raising a MatchError.
  with {:ok, options} <- Resize.validate_options(options),
       {:ok, _file} <- file_exists?(image_path) do
    Operation.thumbnail(image_path, width, options)
  end
end
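For readers more at home in Python, a loose analogue of this clause-based dispatch is functools.singledispatch, which selects an implementation from the type of the first argument. The Image class and the returned strings below are hypothetical stand-ins, not part of any real library, and Elixir's pattern matching and guards are more general than this.

```python
from dataclasses import dataclass
from functools import singledispatch


@dataclass
class Image:
    """Hypothetical stand-in for an already-opened image."""
    width: int


@singledispatch
def resize(source, width: int) -> str:
    # Fallback for any type without a registered implementation.
    raise TypeError(f"cannot resize {type(source).__name__}")


@resize.register
def _(source: Image, width: int) -> str:
    # An opened image: only the thumbnail_image path is available.
    return f"thumbnail_image(<image {source.width}px>, {width})"


@resize.register
def _(source: str, width: int) -> str:
    # A path: the loader can be involved, so the thumbnail path applies.
    return f"thumbnail({source!r}, {width})"


print(resize(Image(width=6000), 250))
print(resize("nina.jpg", 250))
```

The caller writes resize(...) either way, and the argument type decides which operation runs, which is the same convenience the two Elixir clauses provide.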
I've taken up enough of your time, so I'll close the issue with many thanks.