golang/go

compress/flate: hang

dvyukov opened this issue · 2 comments

The following program:

package main

import (
    "bytes"
    "compress/flate"
    "encoding/hex"
    "io/ioutil"
)

func main() {
    data, _ := hex.DecodeString("344c4a4e494d4b070000ff2e2eff2e2e2e2e2eff")
    _, err := ioutil.ReadAll(flate.NewReader(bytes.NewReader(data)))
    if _, ok := err.(flate.InternalError); ok {
        panic(err)
    }
}

hangs at:

#0  compress/flate.(*decompressor).huffSym (f=0xc208070000, h=0xc208070030, ~r1=0, ~r2=...)
    at /ssd/src/go10/src/compress/flate/inflate.go:645
#1  0x0000000000454a59 in compress/flate.(*decompressor).huffmanBlock (f=0xc208070000)
    at /ssd/src/go10/src/compress/flate/inflate.go:406
#2  0x0000000000453f03 in compress/flate.(*decompressor).Read (f=0xc208070000, b= []uint8 = {...}, 
    ~r1=32768, ~r2=...) at /ssd/src/go10/src/compress/flate/inflate.go:286
#3  0x000000000044f7ec in bytes.(*Buffer).ReadFrom (b=0xc208046070, r=..., n=386662400, err=...)
    at /ssd/src/go10/src/bytes/buffer.go:173
#4  0x0000000000457f0f in io/ioutil.readAll (r=..., capacity=512, b= []uint8, err=...)
    at /ssd/src/go10/src/io/ioutil/ioutil.go:33
#5  0x0000000000458028 in io/ioutil.ReadAll (r=..., ~r1= []uint8, ~r2=...)
    at /ssd/src/go10/src/io/ioutil/ioutil.go:42
#6  0x0000000000400f42 in main.main () at /tmp/flate.go:12

I am on commit a02d925

It seems there's two issues with that DEFLATE sequence. One is that the dynamic Huffman block header is bogus, and two is that the dynamic Huffman block body is bogus.

The first issue causes f.codebits to become [3 0 0 0 0 0 0 1 2 0 0 0 0 0 0 0 5 4 4], which when passed to h.init instructs it to assign a 1-bit code to 7, a 2-bit code to 8, a 3-bit code to 0, two 4-bit code to 17 and 18, and a 5-bit code to 16. However, by time we get to 16, there aren't any prefix-free codes left to assign.

What compress/flate currently does in this case is it assigns "0" to 7 and "00000" to 16. (It actually assigns 0x20 to 16, but when it gets reversed and cropped to 5 bits, it becomes all zeros.) This results in a sort of lookahead decoding behavior: if a 0 bit happens to be followed by 4 more 0 bits, then it will be decoded as 16, otherwise it gets decoded as 7. (Notably, 16 only takes priority over 7 like this because it comes later in the "for i, n := range bits" loop in (*huffmanDecoder).init.)

The second issue is as I described in https://go-review.googlesource.com/#/c/8893/: "Dmitry's test case is using dynamic huffman coding, but with a non-optimal encoding. There are only 16 symbols, but it's assigning length 7 and 8 codes to each of them, which leaves a bunch of invalid prefixes. Later when one of those prefixes is used, the corresponding h.chunks entry will still be 0, which huffSym will misinterpret to mean it found a 0 length code."

Finding a 0 length code causes it to consume 0 bits of input and then try again, which of course means it will find the same code again and again.`

@rsc @adg Will there be another stable release before 1.5? If so, can I suggest backporting 5f0ac4a?