sql: SELECT ("*") parse oddities
tbg opened this issue · 4 comments
SELECT ("*")
reproduces as SELECT (*)
, which isn't valid SQL.
On the other hand,
SELECT FROM t WHERE a = COUNT(*)
is valid SQL.
In both cases, *
is a QualifiedName{"a"}
but for the first, the corresponding String()
must be "*"
(not *
).
I've looked around for an obvious fix for this, but I'm thinking this is something that should be done at the grammar level by having *
as a keyword of some sorts.
this, again, found by fuzzing. ping @dvyukov.
edit: even easier: SELECT "*"
-> SELECT *
.
I've tried to get to the bottom of this, but it's rabbit hole-ish. Basically the issue is that we have StarExpr
(which is the *
in SELECT *
), but StarExpr
isn't even an Expr
so far. We use QualifiedName
in most places, in particular SELECT foo.*
has foo.*
as a QualifiedName{"foo", "*"}
so all the starsyness is completely lost. Looks like without refactoring the grammar with respect to *
handling we can't really resolve this.
Maybe StarExpr
and NonStarExpr
should go, and instead QualifiedName
should contain that information (the only crucial point is that "(ident|empty)*"
and (ident|empty)*
are distinguishable). If *
were not an allowed column name, we'd be home free but Postgres allows that.
ping @petermattis
This isn't so much the grammar (which is parsing things fine), but the generated syntax tree. QualifiedName
(which is just a []string
) is inadequate at capturing some of the complexities. In particular, *
is being treated as a string element where it probably deserves special handling.
something like this?
type QualifiedName interface {
qualifiedName()
}
// with implementers:
type StarExpr struct {
NonStarExpr // by convention, with Column() = ""
}
type NonStarExpr {
// holds and exposes database, table, column
}
I don't think this is sufficient. This is an area of the postgres grammar that was significantly more complex than the vitess grammar. See the indirection
and indirection_elem
rules in sql.y
. It looks like foo.*.*
is legal for the grammar. I'm not sure where that would even be used (foo.*
is usually all columns from the table foo
). We're also not properly generating the syntax tree for array access, but I doubt we'll be supporting arrays any time soon.
Seems like QualifiedName
needs to be something like []QualifiedElem
where QualifiedElem
is an interface
:
type QualifiedElem interface {
qualifiedElem()
String() string
}
func (Name) qualifiedElem() {}
func (Star) qualifiedElem() {}
func (*Subscript) qualifiedElem() {}
type Name string
func (n Name) string() {
if _, ok := keywords[strings.ToUpper(string(n))]; ok {
return fmt.Sprintf(`"%s"`, n)
}
return encIdent(n)
}
type Star string
func (Star) String() string {
return "*"
}