The tree of where clause does not seem to be creating correctly
Closed this issue · 3 comments
debedb commented
Consider the following code:
from mo_sql_parsing import parse, format
import json
where = {
"or": [
{
"and": [
{
"gt": [
"salary",
1000
]
},
{
"between": [
"e.hired_date",
{
"literal": "2020-01-01"
},
{
"literal": "2020-12-31"
}
]
}
]
},
{
"and": [
{
"lt": [
"salary",
1000
]
},
{
"between": [
"e.hired_date",
{
"literal": "2021-01-01"
},
{
"literal": "2021-12-31"
}
]
}
]
}
]
}
ast = {
"select" : [
{
"name": "total_cnt",
"value": {
"count": "*"
}
}
],
"from": [
{
"value": "foo"
}
],
"where" : where
}
print(json.dumps(ast, indent=2))
query = format(ast)
print(query)
The AST provided is
{
"select": [
{
"name": "total_cnt",
"value": {
"count": "*"
}
}
],
"from": [
{
"value": "foo"
}
],
"where": {
"or": [
{
"and": [
{
"gt": [
"salary",
1000
]
},
{
"between": [
"e.hired_date",
{
"literal": "2020-01-01"
},
{
"literal": "2020-12-31"
}
]
}
]
},
{
"and": [
{
"lt": [
"salary",
1000
]
},
{
"between": [
"e.hired_date",
{
"literal": "2021-01-01"
},
{
"literal": "2021-12-31"
}
]
}
]
}
]
}
}
The output is
SELECT COUNT(*) AS total_cnt
FROM foo
WHERE salary > 1000 AND e.hired_date BETWEEN '2020-01-01' AND '2020-12-31'
OR
(salary < 1000 AND e.hired_date BETWEEN '2021-01-01' AND '2021-12-31')
Yet I expect the output to be
SELECT COUNT(*) AS total_cnt
FROM foo
WHERE (salary > 1000 AND e.hired_date BETWEEN '2020-01-01' AND '2020-12-31' )
OR
(salary < 1000 AND e.hired_date BETWEEN '2021-01-01' AND '2021-12-31')
debedb commented
- While
AND
does have higher precedence, the problem would be still there if the operators are reversed - A simple but inelegant solution is to wrap every binary op in parentheses of course.
klahnakoski commented
I am not sure if there is a problem here. The formatting functions track the operator precedence to minimized the number of parentheses generated. If you reverse the operators, parenthesis will be included.
I ran your ast through the formatter to get a single line with no parenthesis, and it appears correct.
SELECT COUNT(*) AS total_cnt FROM foo WHERE salary > 1000 AND e.hired_date BETWEEN '2020-01-01' AND '2020-12-31' OR salary < 1000 AND e.hired_date BETWEEN '2021-01-01' AND '2021-12-31'
debedb commented
@klahnakoski yes, you are correct; I have tried reversing AND
and OR
and the proper paretheses are there. My mistake. Closing.