postgres_attach fails with SQL syntax error when postgres contains table with column of type "array of enum"
Closed this issue · 7 comments
What happens?
When you have table in postgres, containing a column whose type is "array of enum",
running postgres_attach fails with error like this:
Error: IO Error: Unable to query Postgres: ERROR: syntax error at or near "."
LINE 1: SELECT unnest(enum_range(NULL::.myenum))
^
To Reproduce
- Start latest version (15.2) of postgresql locally in docker (but it's also reproducible with older postgres, like 12.11)
docker run \
-p 5432:5432 \
-e POSTGRES_USER=test \
-e POSTGRES_PASSWORD=test \
-e POSTGRES_DB=test \
-e PGDATA=/var/lib/postgresql/data/pgdata \
--tmpfs /var/lib/postgresql/data/pgdata \
postgres:15.2-alpine \
-c log_statement=all- Connect to test db using psql
psql "host=localhost user=test password=test"and init simple DB schema using following statements (note the column whose type is array of some enum)
CREATE TYPE myenum as ENUM ('whatever');
CREATE TABLE test (array_of_myenum myenum[]);- Start duckdb from CLI and try postgres_attach
~/Tmp> duckdb
v0.7.0 f7827396d7
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.
D LOAD postgres;
D CALL postgres_attach('host=localhost user=test password=test');
Error: IO Error: Unable to query Postgres: ERROR: syntax error at or near "."
LINE 1: SELECT unnest(enum_range(NULL::.myenum))
^
ERROR: syntax error at or near "."
LINE 1: SELECT unnest(enum_range(NULL::.myenum))
^OS:
Linux - Fedora 36
PostgreSQL Version:
reproduced in 12.11 and 15.2
DuckDB Version:
v0.7.0 f7827396d7
DuckDB Client:
duckdb cli
Full Name:
Jan Hrček
Affiliation:
Holmusk
Have you tried this on the latest master branch?
- I agree
Have you tried the steps to reproduce? Do they include all relevant data and configuration? Does the issue you report still appear there?
- I agree
I'm also running into this -- quite a lot of our arrays are enum types ha.
@jhrcek did you ever figure out a work around?
No I didn't yet figure it out.
For now I'm just "importing" individual tables that don't have enum-array columns, using postgres_scan https://duckdb.org/docs/guides/import/query_postgres.html
Gotcha. Thanks for opening this @jhrcek. This is somewhat if a blocker for a project so I'll try to open a PR 😄 .
I think I've found the issue. Array types go down this code path:
// TODO better check, does the typtyp say something here?
// postgres array types start with an _
if (StringUtil::StartsWith(pgtypename, "_")) {
return LogicalType::LIST(DuckDBType2(ele_info, atttypmod, nullptr, conn, context));
}
The ele_info value is being passed as DuckDBType2's type_info argument. A few lines down, when the new instance of DuckDBType2 reaches enum unnesting it fails because the type_info->nspname is empty:
if (type_info->typtype == "e") { // ENUM
auto res = PGQuery(
conn, StringUtil::Format("SELECT unnest(enum_range(NULL::%s.%s))", type_info->nspname, type_info->typname));
Vector duckdb_levels(LogicalType::VARCHAR, res->Count());
for (idx_t row = 0; row < res->Count(); row++) {
duckdb_levels.SetValue(row, res->GetString(row, 0));
}
The issue is that the ele_info object is not getting nspname set initially here:
info.elem_info.typname = res->GetString(row, 7);
info.elem_info.typlen = res->GetInt64(row, 8);
info.elem_info.typtype = res->GetString(row, 9);
Ultimately, this should be the fix but I will confirm in Discord:
info.elem_info.nspname = res->GetString(row, 2);
info.elem_info.typname = res->GetString(row, 7);
info.elem_info.typlen = res->GetInt64(row, 8);
info.elem_info.typtype = res->GetString(row, 9);
Gotcha. Thanks for opening this @jhrcek. This is somewhat if a blocker for a project so I'll try to open a PR 😄 .
I think I've found the issue. Array types go down this code path:
// TODO better check, does the typtyp say something here? // postgres array types start with an _ if (StringUtil::StartsWith(pgtypename, "_")) { return LogicalType::LIST(DuckDBType2(ele_info, atttypmod, nullptr, conn, context)); }The
ele_infovalue is being passed as DuckDBType2'stype_infoargument. A few lines down, when the new instance of DuckDBType2 reaches enum unnesting it fails because thetype_info->nspnameis empty:if (type_info->typtype == "e") { // ENUM auto res = PGQuery( conn, StringUtil::Format("SELECT unnest(enum_range(NULL::%s.%s))", type_info->nspname, type_info->typname)); Vector duckdb_levels(LogicalType::VARCHAR, res->Count()); for (idx_t row = 0; row < res->Count(); row++) { duckdb_levels.SetValue(row, res->GetString(row, 0)); }The issue is that the
ele_infoobject is not gettingnspnameset initially here:info.elem_info.typname = res->GetString(row, 7); info.elem_info.typlen = res->GetInt64(row, 8); info.elem_info.typtype = res->GetString(row, 9);Ultimately, this should be the fix but I will confirm in Discord:
info.elem_info.nspname = res->GetString(row, 2); info.elem_info.typname = res->GetString(row, 7); info.elem_info.typlen = res->GetInt64(row, 8); info.elem_info.typtype = res->GetString(row, 9);
That sounds like it could fix the issue indeed - happy to review a PR.
Appreciate the quick response @Mytherin. I'll follow up with a PR shortly. Cheers!
This should be fixed now