undefined method `unpack' for nil:NilClass while fetching TIFF image size from partial data
When trying to get the TIFF image size from chunks of data, I end up with the following exception:
Traceback (most recent call last):
5: from bin/console:14:in `<main>'
4: from (irb):17
3: from (irb):17:in `new'
2: from /Users/gottfrois/.rvm/gems/ruby-2.6.6/gems/image_size-2.0.2/lib/image_size.rb:65:in `initialize'
1: from /Users/gottfrois/.rvm/gems/ruby-2.6.6/gems/image_size-2.0.2/lib/image_size.rb:227:in `size_of_tiff'
NoMethodError (undefined method `unpack' for nil:NilClass)
Steps to reproduce:
2.6.6 :001 > require 'open-uri'
=> true
2.6.6 :002 > fd = URI.parse('https://effigis.com/wp-content/themes/effigis_2014/img/Airbus-Spot6-50cm-St-Benoit-du-Lac-Quebec-2014-09-04.tif').open('rb')
=> #<Tempfile:/var/folders/_9/fzxprxqn2_10xdbghn1fb7q40000gn/T/open-uri20200724-10967-n9efp3>
2.6.6 :003 > ImageSize.new(fd.read(100))
The idea is to get the image size without reading the full image, for performance reasons. But after reading about the TIFF file header format, I wonder if that is even possible. As far as I understand, the header gives you the offset where the image data starts, which can be a large number that points to a nil value when not all of the data has been loaded.
Nevertheless, it might be a good idea to raise an ImageSize::FormatError in this case instead of an undefined method error. What do you think?
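To illustrate the point about the header, here is a minimal sketch that parses only the 8-byte header of a classic (non-BigTIFF) file and checks whether the first image file directory, which holds the width and height tags, falls inside the bytes at hand. The helper names `tiff_first_ifd_offset` and `tiff_ifd_available?` are hypothetical and not part of image_size, and the sample data is synthetic rather than taken from the file above.

```ruby
# Hypothetical helper, not part of image_size: parse only the 8-byte TIFF
# header and return the offset of the first image file directory (IFD).
# Returns nil when the data does not start with a classic TIFF header.
def tiff_first_ifd_offset(data)
  return nil if data.nil? || data.bytesize < 8

  # 'II' marks little-endian data, 'MM' marks big-endian data.
  endian = case data.byteslice(0, 2)
           when 'II' then :little
           when 'MM' then :big
           end
  return nil unless endian

  short = endian == :little ? 'v' : 'n' # 16-bit unsigned
  long  = endian == :little ? 'V' : 'N' # 32-bit unsigned

  # Bytes 2-3 hold the magic number 42, bytes 4-7 the first IFD offset.
  return nil unless data.byteslice(2, 2).unpack1(short) == 42

  data.byteslice(4, 4).unpack1(long)
end

# True when at least the 2-byte entry count of the first IFD is inside data.
def tiff_ifd_available?(data)
  offset = tiff_first_ifd_offset(data)
  !offset.nil? && offset + 2 <= data.bytesize
end

# Synthetic 100-byte chunk: a little-endian header pointing at an arbitrary
# far-away IFD, padded with zeros to mimic fd.read(100) on a large file.
partial = ['II', 42, 100_000_000].pack('a2vV') + ("\0" * 92)
puts tiff_first_ifd_offset(partial) # prints 100000000
puts tiff_ifd_available?(partial)   # prints false
```

With a check like this, a caller could tell "not enough data yet" apart from "not a TIFF" before trying to unpack anything at the IFD offset.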
Thank you for opening the issue, it was interesting to dig a bit deeper.
I've debugged what is read from the image to determine its size, and unfortunately the width and height information is at offsets 100_083_302 and 100_083_314, which is ~66kB from the end of the file. You are right, having meta information at the end of the file is common for TIFF files. The second problem is that open-uri fetches everything, so the small chunk is read from a tempfile containing the completely downloaded file.
About raising ImageSize::FormatError: I certainly agree.
I wrote some quick'n'dirty code for getting the TIFF size without downloading the complete file. It also brought some ideas for improving the reading of local files.
require 'net/http'
require 'image_size'

class ImageSize
  remove_const(:ImageReader)
  class ImageReader
    def initialize(uri)
      raise ArgumentError, "expected instance of URI" unless uri.is_a?(URI)
      @uri = uri
      @http = Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == 'https', keep_alive_timeout: 60)
      @chunks = {}
    end

    # def size
    #   @size ||= Integer(http.head(uri)['Content-Length'])
    # end

    CHUNK = 1024

    def [](offset, length)
      # Ignoring the fact that requested data can span multiple chunks
      chunk_number, chunk_offset = offset.divmod(CHUNK)
      @chunks[chunk_number] ||= @http.get(@uri, 'Range' => "bytes=#{chunk_number * CHUNK}-#{(chunk_number + 1) * CHUNK - 1}").body
      # Ignoring error handling
      @chunks[chunk_number][chunk_offset, length]
    end
  end
end

p ImageSize.new(URI('https://effigis.com/wp-content/themes/effigis_2014/img/Airbus-Spot6-50cm-St-Benoit-du-Lac-Quebec-2014-09-04.tif'))
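The reader above deliberately ignores requests that span more than one chunk. One possible way to handle that case is sketched below, assuming the fetching step is passed in as a block so the chunk logic can be tried without a network connection. `ChunkedReader` and its block interface are hypothetical names for illustration, not part of image_size; in real use the block would wrap the HTTP Range request.

```ruby
# Hypothetical reader that serves arbitrary byte ranges from fixed-size,
# cached chunks. The block receives (first_byte, last_byte) and returns a
# String (or nil when nothing is available for that range).
class ChunkedReader
  CHUNK = 1024

  def initialize(&fetch)
    @fetch = fetch
    @chunks = {}
  end

  # Returns the bytes at offset..offset+length-1, joining as many chunks as
  # needed. Like String#byteslice, the result is nil or shorter than
  # requested when the range runs past the end of the data.
  def [](offset, length)
    first = offset / CHUNK
    last = (offset + length - 1) / CHUNK
    data = (first..last).map { |number| chunk(number) }.join
    data.byteslice(offset - first * CHUNK, length)
  end

  private

  # Fetches a chunk once and keeps it; an empty chunk is cached as well.
  def chunk(number)
    @chunks[number] ||= @fetch.call(number * CHUNK, (number + 1) * CHUNK - 1).to_s.b
  end
end

# In-memory stand-in for the remote file, 3000 bytes long.
source = Array.new(3000) { |i| i % 251 }.pack('C*')
requests = []
reader = ChunkedReader.new do |first_byte, last_byte|
  requests << [first_byte, last_byte]
  source.byteslice(first_byte, last_byte - first_byte + 1)
end

span = reader[1020, 8] # crosses the boundary between chunk 0 and chunk 1
puts span == source.byteslice(1020, 8) # prints true
puts requests.inspect                  # prints [[0, 1023], [1024, 2047]]
```

Caching whole chunks keeps the number of requests low when the parser reads many small fields that sit close together, which is the usual pattern for TIFF directory entries.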
Great feedback!
I'm currently using your gem in mine, and for now I simply rescue the NoMethodError, but this is more of a hack than anything...
https://github.com/gottfrois/image_info/blob/master/lib/image_info/parser.rb#L32
Not sure what we can do here other than cleanly raising a format error and improving the codebase so that it does not expect "something" to be present at index "offset" when someone might have loaded the image only partially.
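As a rough sketch of what "cleanly raising" could look like, assuming every read goes through a single bounds-checked helper: `read_exact` and `TruncatedImageError` are hypothetical names used here for illustration, with the error class standing in for ImageSize::FormatError, and offsets are assumed to be non-negative.

```ruby
# Hypothetical error class standing in for ImageSize::FormatError.
class TruncatedImageError < StandardError; end

# Returns exactly `length` bytes starting at `offset`, or raises when the
# data is too short, instead of handing nil to a later unpack call.
def read_exact(data, offset, length)
  bytes = data.byteslice(offset, length)
  if bytes.nil? || bytes.bytesize < length
    raise TruncatedImageError,
          "need #{length} bytes at offset #{offset}, have #{data.bytesize} bytes in total"
  end
  bytes
end

# Synthetic 8-byte TIFF header pointing at an arbitrary far-away offset.
header = ['II', 42, 100_000_000].pack('a2vV')
puts read_exact(header, 4, 4).unpack1('V') # prints 100000000

begin
  read_exact(header, 100_000_000, 2)
rescue TruncatedImageError => e
  puts "format error: #{e.message}"
end
```

Funnelling reads through one helper means a partially loaded image fails in one predictable place with a descriptive message, rather than wherever `unpack` happens to be called on nil.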
@gottfrois It would be great if you could check the wip branch.