GeertJohan/go.rice

virtualfile.read can lead to infinite loops

nkovacs opened this issue · 8 comments

I'm using rice with https://github.com/mattes/migrate. It has a pretty complicated combination of buffered readers, pipes and bytes.Buffer, and I am getting an infinite loop, consuming all memory.

The problem is with the implementation of virtualfile.read. If the buffer is larger than the remaining contents of the file, you return the remaining bytes and an io.EOF. It looks like some reader wasn't prepared for that, and expects more content if it gets some bytes. Because you reset vf.offset to zero, the next read will again read some bytes, and this ended up being an infinite loop that consumes all available memory: https://github.com/GeertJohan/go.rice/blob/master/virtual.go#L86

While investigating this, I found a related problem. This will output the file contents twice:

package main

import (
	"bufio"
	"fmt"
	"io/ioutil"

	rice "github.com/GeertJohan/go.rice"
)

var DefaultBufferSize = uint(100000)

func main() {
	box, err := rice.FindBox("example-files")
	if err != nil {
		panic(err)
	}
	f, err := box.Open("test.txt")
	if err != nil {
		panic(err)
	}
	defer f.Close()

	b := bufio.NewReaderSize(f, int(DefaultBufferSize))
	b.Peek(int(DefaultBufferSize))

	bs, err := ioutil.ReadAll(b)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%s", bs)
}

test.txt is:

this is a test

virtualDir.readdir is also wrong.

If n is <= 0, os.Readdir returns all the remaining files and never returns an io.EOF. rice's virtualDir.readdir always returns all files (ignoring the offset, and even resetting the offset), and returns an io.EOF.
if n > 0, os.Readdir only returns an io.EOF if there are no remaining files (so it doesn't return io.EOF if it reached the end of the directory, but there were files, see https://golang.org/src/os/dir_unix.go#L37), and it doesn't reset.

Can you bit more specific how you triggered these conditions? I cannot reproduce any of them. :(

The reader (in mattes/migrate) wasn't implemented in the most robust way, and if it receives some bytes, it tries to read again, it doesn't check for the io.EOF error. The correct behavior from rice would be to return zero bytes and an io.EOF again on the subsequent read. Instead it resets the offset to zero and starts returning the file from the start again, leading to an infinite loop.

You can reproduce it by writing a for loop that reads until io.EOF, then doing another read. That last read should return 0 bytes and io.EOF. Instead it returns the start of the file again.

I've fixed these bugs in my fork: nkovacs@341d7a8, nkovacs@96d5b2a

I would be happy to merge those fixes upstream!

@nkovacs Do you want create a PR for those? Or shall I just pull them in manually?
Also, I've sent you a mail, hope it didn't end up in spam ;)

This virtualfile.Read issue looks very similar to the one we faced too, I've also submitted a PR with a fix here #159.

Heh, after digging in deep to debug an infinite loop in arduino-builder, I found the problem is in the rice read function, find this bug report and see that @cmaglie has already done (probably) exactly the same... But with a fix, just tested it and it works perfectly :-D

I faced the exact issue. Because go-migrate is so complicated, I initially thought it was some edge case on their side. But after spending many hours reading the related logic, I managed to reduce the reproducible code to contain only go.rice. Then I found this issue...

I tried just using Peek and ReadAll like @nkovacs did, but I'm not able to reproduce this issue as well. However it's very easy to reproduce using an io.Pipe (just like how go-migrate does), @GeertJohan you can use this repository to test out.