feat: provide performance hints
nicoabie opened this issue · 6 comments
at compile time we could analyze queries against a prod instance and provide useful insights to the user. like the need to create indexes.
more on the next comment
More information regarding the idea.
It would be to include an optimization check into the Compile Time check. My recommendation would be some modifications in the Type Generation to include additional metadata that can be discarded after the compile time check is done. I'd recommend leaving the scope to single column indexes while ironing it out, and then maybe target multi-column indexes. Single Column indexes are very common, so having the DB Library recommend them shouldn't be too bad.
Do note that Postgres states there is no silver bullet for Indexes, in an ideal world we'd probably hook into a production DB, check if some variant of ANALYZE
exists, and run it for all the queries. Instead this feature is to suggest good places for developers less familiar with DBA where to look.
A practical example should help showcase the idea. We'll have two tables to support a Customer/Order Relationship.
CREATE TABLE customer (
id serial PRIMARY KEY,
name text NOT NULL,
age integer
);
CREATE TABLE order (
id serial PRIMARY KEY,
customer integer NOT NULL,
total varchar(64) NOT NULL DEFAULT '0.00',
currency varchar(3) NOT NULL DEFAULT 'USD',
FOREIGN KEY (customer)
REFERENCES customer (id)
);
Now lets add in some queries to reference. (Apologizes for any bad syntax, I rarely just yolo SQL like this)
-- $1 would be a customer ID. Query in general is "Get total order count by customer Id"
SELECT COUNT(order.id) FROM order INNER JOIN customer on order.customer=customer.id WHERE customer.id = $1
-- $1 is a currency code, Query in general is "Get average customer age by order currency"
SELECT AVG(customer.id) FROM customer JOIN order on order.customer=customer.id WHERE order.currency = $1
Hopefully that is enough to work with, since I can't easily come up with more examples quickly.
In this case, ojotas would recommend an index on the order.customer
column. We can see there is no index there, and both of the queries the application would run (in this small example) utilize a JOIN
on that column. This points towards an area to review for "free" performance.
Naturally any hint system should have an opt-out feature, since if you do the review and find no reason to utilize the tip, you don't want to see it anymore. I'd recommend the following addition to the ojotasrc.json
{
"hints": {
"enabled": boolean,
"ignore": [
{
"table": string,
"column": string?,
"hint_type": enum
}
]
}
}
Where if enabled
is true
, then ignore
factors in. If hints
is false
, then no hint scanning is done.
When hints are enabled, than a declared table, or table/column pair can be excluded from specific hints (Assuming more hints get added.
As an example, lets display a configuration option to disable hint scanning on our order
table, and then our order
/customer
Table Column Pair.
{
"hints": {
"enabled": true,
"ignore": [
{
"table": "order",
"column": null,
"hint_type": null
}
]
}
}
{
"hints": {
"enabled": true,
"ignore": [
{
"table": "order",
"column": "customer",
"hint_type": "INDEX"
}
]
}
}
column
could be adjusted to columns
if it is preferred to have a string[]
instead.
Another thing we could do related to this is that if we are able to analyze al the sql queries, we could suggest the user to create composite indexes that could satisfy many queries instead of having many secondary indexes.
Also we could detect cases of index obfuscation
Another idea, show a warning when using OR
in the ON
clause of a JOIN
Maybe suggest to use union all for each branch
video on index obfuscation https://planetscale.com/learn/courses/mysql-for-developers/queries/redundant-and-approximate-conditions?autoplay=1 and how to mitigate it