vitessio/vitess

go/vt/sqlparser: index out of range

dvyukov opened this issue · 6 comments

The following program crashes with the panic:

package main

import (
    "github.com/youtube/vitess/go/vt/sqlparser"
)

func main() {
    sql := "SET o=t(0=t(0=t(0=t(0=t(0=t(0=t(0=t(0=t(0=t(0=t(0=t(0=t(0=t(0=t(0=t(0=t(0=t(0=t(0=t(0=t(0=t(0=t(0=t(0=t(0=t(0=t(0=t(0=t(0=t(0=t(0=t(0=t(0=t(0=t(0=t(0=t(0=t(0=t(0=t(0=t(0=t(0=t(0=t(0=t(0=t(0=t(0=t(0=t(0"
    sqlparser.Parse(sql)
}
panic: runtime error: index out of range

goroutine 1 [running]:
github.com/youtube/vitess/go/vt/sqlparser.yyParse(0x7fd9d9e3bc08, 0xc208066000, 0x662df8)
    src/github.com/youtube/vitess/go/vt/sqlparser/yaccpar:207 +0xd7c1
github.com/youtube/vitess/go/vt/sqlparser.Parse(0x5cd9f0, 0xc9, 0x0, 0x0, 0x0, 0x0)
    src/github.com/youtube/vitess/go/vt/sqlparser/ast.go:31 +0xa0
main.main()
    sql.go:9 +0x40

on commit

discovered with https://github.com/dvyukov/go-fuzz

@sougou This appears to be a bug in the yacc-generated code. Is there anything we can do? Try a newer version of yacc?

I've been meaning to look at it once I'm done with my benchmark work. Looking at the test, it most likely is a stack overflow due high level of nesting. I'll either have to change to grammar to reduce more aggressively rather than shifting, or limit the nesting count and return a meaningful error.
I don't think yacc has changed in 20 years :).

Well, the "go tool yacc" version has been improving: https://github.com/golang/go/commits/master/src/cmd/yacc :)

Here are two more reproducers with slightly different stacks (one of them is probably the same as original):

"SELECT(F(F(F(F(F(F(F" +
    "(F(F(F(F(F(F(F(F(F(F" +
    "(F(F(F(F(F(F(F(F(F(F" +
    "(F(F(F(F(F(F(F(F(F(F" +
    "(F(F(F(F(F(F(F(F(F(F" +
    "(F(F(F(F(F(F(F(F(F(F" +
    "(F(F(F(F(F(F(F(F(F(F" +
    "(F(F(F(F(F(F(F(F(F(F" +
    "(F(F(F(F(F(F(F(F(F(F" +
    "(F(F(F(F(F(F(F(F(F(F" +
    "B,:B,:A\t(F("
"SELECT(F(F(F(F(F(F(F" +
    "(F(F(F(F(F(F(F(F(F(F" +
    "(F(F(F(F(F(F(F(F(F(F" +
    "(F(F(F(F(F(F(F(F(F(F" +
    "(F(F(F(F(F(F(F(F(F(F" +
    "(F(F(F(F(F(F(F(F(F(F" +
    "(F(F(F(F(F(F(F(F(F(F" +
    "(F(F(F(F(F(F(F(F(F(F" +
    "(F(F(F(F(F(F(F(F(F(F" +
    "(F(F(F(F(F(F(F(F(F(F" +
    "(0"

I don't know what changed, but I wasn't able to reproduce this from the latest pull. In any case, I have a CL that limits nesting of anything parenthesized to a fixed number. I'll hard code that to 200.
It's possible that there are other nestable constructs in the grammar that don't use parenthesis. We can cross those bridges as they come up.