ossc-db/pg_hint_plan

Doesn't work unless the execution order is serial

come25136 opened this issue · 9 comments

Hello!
I have noticed that in some cases when using prisma in node.js, for some reason the hint clause does not work.
When I run the same query in psql it works, but not when I run it from prisma($queryRaw).
Here is the log on the postgresql side when the query clause does not work.

2023-06-23 00:23:34.953 UTC [68] LOG:  duration: 1.246 ms  parse s21: /*+ Leading(xxx) IndexScan(yyy) */ SELECT ...
2023-06-23 00:23:34.956 UTC [68] LOG:  duration: 0.730 ms  bind s15: SELECT t.typname, t.typtype, t.typelem, r.rngsubtype, t.typbasetype, n.nspname, t.typrelid
	FROM pg_catalog.pg_type t
	LEFT OUTER JOIN pg_catalog.pg_range r ON r.rngtypid = t.oid
	INNER JOIN pg_catalog.pg_namespace n ON t.typnamespace = n.oid
	WHERE t.oid = $1

2023-06-23 00:23:34.956 UTC [68] DETAIL:  parameters: $1 = '16520'
2023-06-23 00:23:34.956 UTC [68] LOG:  execute s15: SELECT t.typname, t.typtype, t.typelem, r.rngsubtype, t.typbasetype, n.nspname, t.typrelid
	FROM pg_catalog.pg_type t
	LEFT OUTER JOIN pg_catalog.pg_range r ON r.rngtypid = t.oid
	INNER JOIN pg_catalog.pg_namespace n ON t.typnamespace = n.oid
	WHERE t.oid = $1

2023-06-23 00:23:34.956 UTC [68] DETAIL:  parameters: $1 = '16520'
2023-06-23 00:23:34.956 UTC [68] LOG:  duration: 0.055 ms
2023-06-23 00:23:34.958 UTC [68] LOG:  duration: 0.351 ms  bind s17: SELECT enumlabel
	FROM pg_catalog.pg_enum
	WHERE enumtypid = $1
	ORDER BY enumsortorder

2023-06-23 00:23:34.958 UTC [68] DETAIL:  parameters: $1 = '16520'
2023-06-23 00:23:34.959 UTC [68] LOG:  execute s17: SELECT enumlabel
	FROM pg_catalog.pg_enum
	WHERE enumtypid = $1
	ORDER BY enumsortorder

2023-06-23 00:23:34.959 UTC [68] DETAIL:  parameters: $1 = '16520'
2023-06-23 00:23:34.959 UTC [68] LOG:  duration: 0.076 ms
2023-06-23 00:23:34.964 UTC [68] LOG:  duration: 3.661 ms  bind s21: /*+ Leading(xxx) IndexScan(yyy) */ SELECT ...
2023-06-23 00:23:34.964 UTC [68] LOG:  execute s21: /*+ Leading(xxx) IndexScan(yyy) */ SELECT ...
2023-06-23 00:23:35.319 UTC [68] LOG:  duration: 355.316 ms

The logs show that apparently queries are being issued in prisma other than the expected query.
This seems to be issued when selecting an enum column, and I found that the query clause is not working because of this.
The essential cause is not compatibility with prisma, but that there is another query between 2023-06-23 00:23:34.953 UTC [68] LOG: duration: 1.246 ms parse and 2023-06-23 00:23:34.964 UTC [68] LOG: duration: 3.661 ms and the order of execution is not serial.


For reference, here is the log when the query clause is working.
You can see that LOG: pg_hint_plan is being output.

2023-06-22 05:47:36.560 UTC [45] LOG:  duration: 9.351 ms  parse <unnamed>: /*+ Leading(xxx) IndexScan(yyy) */ SELECT ...
2023-06-22 05:47:36.567 UTC [45] LOG:  available indexes for IndexScan(oS): Stop_pkey
2023-06-22 05:47:36.567 UTC [45] STATEMENT:  /*+ Leading(xxx) IndexScan(yyy) */ SELECT ...
2023-06-22 05:47:36.589 UTC [45] LOG:  pg_hint_plan:
	used hint:
	IndexScan(yyy)
	not used hint:
	Leading(xxx)
	duplication hint:
	error hint:

2023-06-22 05:47:36.589 UTC [45] STATEMENT:  /*+ Leading(xxx) IndexScan(yyy) */ SELECT ...
2023-06-22 05:47:36.597 UTC [45] LOG:  duration: 35.569 ms  bind <unnamed>: /*+ Leading(xxx) IndexScan(yyy) */ SELECT ...
2023-06-22 05:47:36.598 UTC [45] LOG:  execute <unnamed>: /*+ Leading(xxx) IndexScan(yyy) */ SELECT ...
2023-06-22 05:47:37.013 UTC [45] LOG:  duration: 415.576 ms

name version
PostgresSQL PostgreSQL 15.2 (Debian 15.2-1.pgdg110+1) on aarch64-unknown-linux-gnu, compiled by gcc (Debian 10.2.1-6) 10.2.1 20210110, 64-bit
pg_hint_plan REL15_1_5_0
prisma 4.8.1

I used DeepL translation, so I may write strange things. I'm sorry.

MEMO

I tried enum select with pg_hint_plan.debug_print = verbose and found that LOG: pg_hint_plan[qno=0x9]: planner: no valid hint was output for all queries.
This is not a misdescription, but actually qno=0x9 for all of them.


When run from psql, qno was incremented correctly.
pg_hint_plan[qno=0x2e]: planner

My guess, following the code from the logs, is that it's probably coming here.

goto standard_planner_proc;

or
goto standard_planner_proc;

I can't write c lang and am not a postgres expert, so I'm closing the investigation once and for all.
If anyone can help me, I'd be super grateful for your help.

p.s. We'll investigate when we can afford it, but don't expect much.

I have increased the debug log and found that current_hint_str is null.

if (!current_hint_str)

I have increased the debug log and found that current_hint_retrieved is true.

if (current_hint_retrieved)

I observed that plpgsql_recurse_level is 0.

if (plpgsql_recurse_level > 0)

From the behavior, I guess that "current_hint_retrieved" does not seem to be reset (assigned false).

Not sure if this is the correct fix, but I have found that adding current_hint_retrieved = false; to the line below current_hint_state = NULL; is a workaround.

// Modified code

    msgqno = qno;
}
current_hint_state = NULL;
current_hint_retrieved = false; // ← add code
if (prev_planner)
    result =  (*prev_planner) (parse, query_string,

pg_hint_plan/pg_hint_plan.c

Lines 3184 to 3188 in 733b418

msgqno = qno;
}
current_hint_state = NULL;
if (prev_planner)
result = (*prev_planner) (parse, query_string,


I have pushed this workaround to my repository.
come25136@76df17f

I was looking at what you have here, and this looks incorrect to me, as current_hint_retrieved's fate is now linked to current_hint_str but not current_hint_state which is what you are attempting to enforce here. It is true that the logic of the code between the state, this boolean and the string involved depending on the depth of the calling stack is confusing.

Anyway, this issue is a duplicate of #145 which is presenting a different approach to reset the flag using an executor hook. So I am closing this one as a duplicate as the patch posted there looks more promising.