x/image/tiff: index out of range
dvyukov opened this issue · 11 comments
The following program:
package main
import (
"bytes"
"golang.org/x/image/tiff"
"io/ioutil"
"os"
)
func main() {
data, _ := ioutil.ReadFile(os.Args[1])
tiff.Decode(bytes.NewReader(data))
}
on this file:
https://drive.google.com/file/d/0B20Uwp8Hs1oCdDhyRzAwbE5qc2M/view?usp=sharing
crashes as follows:
panic: runtime error: index out of range
goroutine 1 [running]:
io/ioutil.readAll.func1(0xc8200436c0)
src/io/ioutil/ioutil.go:30 +0x11e
golang.org/x/image/tiff/lzw.(*decoder).decode(0xc8200b1300)
src/golang.org/x/image/tiff/lzw/reader.go:187 +0x75b
golang.org/x/image/tiff/lzw.(*decoder).Read(0xc8200b1300, 0xc8200cb001, 0xdff, 0xdff, 0x201, 0x0, 0x0)
src/golang.org/x/image/tiff/lzw/reader.go:141 +0x19e
bytes.(*Buffer).ReadFrom(0xc820043618, 0x7f23a2e99518, 0xc8200b1300, 0x1001, 0x0, 0x0)
src/bytes/buffer.go:173 +0x23f
io/ioutil.readAll(0x7f23a2e99518, 0xc8200b1300, 0x200, 0x0, 0x0, 0x0, 0x0, 0x0)
src/io/ioutil/ioutil.go:33 +0x154
io/ioutil.ReadAll(0x7f23a2e99518, 0xc8200b1300, 0x0, 0x0, 0x0, 0x0, 0x0)
src/io/ioutil/ioutil.go:42 +0x51
golang.org/x/image/tiff.Decode(0x7f23a2e992d8, 0xc820018420, 0x7f23a2e99438, 0xc820012480, 0x0, 0x0)
src/golang.org/x/image/tiff/reader.go:646 +0xf2a
main.main()
tiff.go:12 +0xf2
on commit eb11b45157c1b71f30b3cec66306f1cd779a689e
go version devel +3cab476 Sun Jun 21 03:11:01 2015 +0000 linux/amd64
This file looks corrupt to me.
The Mac Preview also has the same rendering problem.
I attached a screenshow of what the Preview looks like:
This Tiff has 6 tiles (150 x 100) and decoding chokes on the 3rd tile with x/image/tiff.
This looks consistent with what Mac Preview renders.
The actual reason: corrupt lzw data
@hhrutter, if the file is corrupt, the tiff
package should return an error, not provoke a panic.
Decoding this image first hits this line:
https://github.com/golang/image/blob/cff245a6509b8c4de022d0d5b9037c503c5989d6/tiff/lzw/reader.go#L213
And next code decoding hits this:
https://github.com/golang/image/blob/cff245a6509b8c4de022d0d5b9037c503c5989d6/tiff/lzw/reader.go#L187
Using decoderInvalidCode = 0xffff
as c
and d.prefix
length is only 4096.
I trying to create simple test without adding corrupted file to repo, but this is complicated case and I have not succeeded yet.
Maybe in this case is ok to add file to repo?
P.S.
I think compress/lzw
is also affected.
@bsiegert may be able to give some input.
he is one of the maintainers of x/image/tiff
Having corrupted image files as part of the test suite seems sensible, if for no other reason than to test the error paths.
I think
compress/lzw
is also affected.
I don't think that compress/lzw
in the stdlib is also affected. The golang.org/x/image/tiff/lzw
variant has this line:
if d.hi+1 >= d.overflow { // NOTE: the "+1" is where TIFF's LZW differs from the standard algorithm.
and, crucially, the +1 isn't in the stdlib's compress/lzw
. In the stdlib variant, we don't hit the d.last = decoderInvalidCode
line until d.hi
hits 0x1000
, not 0x0FFF
in the x/image variant. Furthermore, the code
variable is masked with (1<<d.width - 1)
, which maxes out at 0x0FFF
. So, in the stdlib, we shouldn't get into the if code == d.hi
block with a bad (out of bounds) d.last
...
I'm still thinking about how best to fix the x/image/variant. To repeat what bcmills@ said, we should definitely return an error, not panic.
Ah, the stdlib's compress/lzw/reader.go
and the x lib's golang.org/x/image/tiff/lzw/reader.go
are meant to be identical (bar a couple of explicitly called out lines), but there were a couple of 2017 era patches to the stdlib that I forgot to also apply to the x lib:
https://go-review.googlesource.com/c/go/+/42032/
https://go-review.googlesource.com/c/go/+/45111/
The second of these (45111) is trivial to copy across, and should fix the immediate problem. The first of these (42032) should also be copied across at some point, but it might need a little more thinking.
Change https://golang.org/cl/191221 mentions this issue: tiff/lzw: don't follow code == hi if last is invalid
I don't think that compress/lzw in the stdlib is also affected.
As for why the stdlib version now needs an explicit d.last != decoderInvalidCode
check, even though I argued an hour ago that it didn't need such a check... well, it used to not need such a check, but the 42032 changed introduced the need, due to a new d.hi--
line of code.
Anyway, as I said, 42032 should also be copied across, but it needs a little more thinking.
I have a question. If stdlib and x/iamge version should be the same except +1 thing, and we struggling to apply patches to both versions, why we can't remove x/image version and add something like NewTIFFReader()
and one if
to stdlib version? The performance penalty of one if
is not that big to justify support of 2 different versions.
why we can't remove x/image version and add something like NewTIFFReader() and one if to stdlib version? The performance penalty of one if is not that big to justify support of 2 different versions.
The performance penalty would have to be measured. It's not just one "if", it's one "if" in the inner loop.
But in general, putting TIFF's off-by-one into the stdlib's compress/lzw
is discussed in #25409