DISTINCT statement on multiple columns
Closed this issue · 2 comments
Hi,
Thanks for creating this project! I've been using it for a while and loved it.
However, I noticed a potential issue with SELECT DISTINCT
statement with multiple columns:
For example:
In [2]: print(json.dumps(parse("SELECT DISTINCT Country, City FROM Customers;"), indent=4))
Out[2]:
{
"select": [
{
"value": {
"distinct": "Country"
}
},
{
"value": "City"
}
],
"from": "Customers"
}
The SQL means that we want to find all distinct country & city pairs, but from the parsed result it looks like distinct
statement is only attached to Country
, which IMO is kind of counterintuitive. Currently I have to look into the first value and see if it has distinct
as a key
.
What do we think about parsing the above query into the following format, so that all columns are under the distinct
statement?
{
"select": [
{
"value": {
"distinct": [
{
"value": "Country"
},
{
"value": "City"
}
]
}
}
],
"from": "Customers"
}
Thanks!
Thank you for pointing this out. I will add tests for this.
It is likely that select distinct a, b
will result in
{
"select_distinct":[{"value":"a"}, {"value":"b"}]
}
...just another clause