Consuming IO messages across decoder buffer boundary corrupts decoder state
Closed this issue · 1 comments
One of the constraints I made when writing the styxproto
package was that it use a fixed-size buffer per connection. Another, perhaps unnecessary constraint was to support very small (~.5KB) and very large (2GB) msize parameters, independent of the internal buffer size. One consequence that came out of these constraints is that the two largest messages, Rread
and Twrite
, do not support random access. Instead they implement the io.Reader
interface.
When one of these messages does not fit completely in the Decoder buffer, io.MultiReader
is used to combine the buffered data (using a bytes.Reader
) with an *io.LimitedReader
that consumes from the underlying connection, like so:
m.r = bytes.NewReader(buffered)
if int64(len(buffered)) < count {
m.r = io.MultiReader(
m.r,
io.LimitReader(r, count-int64(len(buffered))))
}
When I wrote this, I thought I was being clever. Unfortunately, as far as I can tell, it's never worked. This would only manifest when reading or writing large files.
After doing a bit of troubleshooting, I believe the problem is in decoder.go
, here:
func (s *Decoder) Next() bool {
if s.msg != nil {
s.err = discard(s.br, s.msg.Len())
discard()
gets rid of msg.Len() bytes from the bufio.Reader
, but the underlying connection has already been advanced by msg.Len() - len(buffered)
bytes.
One complication is that this only occurs when the Read
method is called on one of these messages. If all messages were fully consumed, this would be as simple as replacing discard(s.br, s.msg.Len())
with discard(s.br, len(s.msg.bytes()))
. But since we can't guarantee that, we have to preface it with io.Copy(ioutil.Discard, s.msg)
.