EnterpriseDB/mysql_fdw

Segmentation fault w/ dual left join is null

Closed this issue · 4 comments

Preliminaries:

  • mariadb arm64 docker image (11.0.2-MariaDB-1:11.0.2+maria~ubu2204 mariadb.org binary distribution)
  • mariadb Table test1 (id int primary key auto_increment, name text not null) (name is probably not needed)
  • postgres arm64 docker image w/ mysql_fdw (postgres:16-bookworm + postgresql-16-mysql-fdw, also see versions below)
  • postgres create schema ext1 + import foreign schema.
SELECT *
FROM ext1.test1 r1
LEFT JOIN ext1.test1 r2 ON r2.id=r1.id
LEFT JOIN ext1.test1 r3 ON r3.id=r1.id 
-- WHERE r2 IS NULL;   -- this does not crash
WHERE r3 IS NULL;  -- this segfaults postgres

(Yes, the r2 IS NULL (shorthand for r2.* IS NULL) syntax is uncommon, and not even supported by mysql/mariadb directly. This means it can't be pushed down to mysql and must be evaluated later... But why does the very same thing work for r2, but not for r3? Also, the much more common r3.id IS NULL does not crash.)

LOG:  starting PostgreSQL 16.2 (Debian 16.2-1.pgdg120+2) on aarch64-unknown-linux-gnu, compiled by gcc (Debian 12.2.0-14) 12.2.0, 64-bit
LOG:  listening on IPv4 address "0.0.0.0", port 5432
LOG:  listening on IPv6 address "::", port 5432
LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
LOG:  database system was shut down at 2024-03-04 13:53:55 UTC
LOG:  database system is ready to accept connections
LOG:  server process (PID 46) was terminated by signal 11: Segmentation fault
DETAIL:  Failed process was running: select * from ext1.test1 r1 left join ext1.test1 r2 on r2.id=r1.id left join ext1.test1 r3 on r3.id=r1.id where r3 is null;
LOG:  terminating any other active server processes
LOG:  all server processes terminated; reinitializing

mysql_fdw version:

  Origin URI: http://apt.postgresql.org/pub/repos/apt/pool/main/p/postgresql-mysql-fdw/postgresql-16-mysql-fdw_2.9.1-2.pgdg120%2b1_arm64.deb

This crashes:

test=# explain verbose select * from ext1.test1 r1 left join ext1.test1 r2 on r2.id=r1.id left join ext1.test1 r3 on r3.id=r1.id where r3 is null;
                                                                                                          QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Foreign Scan  (cost=15.00..35.00 rows=125 width=108)
   Output: r1.id, r1.name, r2.id, r2.name, r3.id, r3.name
   Filter: (r3.* IS NULL)
   Relations: ((test1.test1 r1) LEFT JOIN (test1.test1 r2)) LEFT JOIN (test1.test1 r3)
   Remote server startup cost: 25
   Remote query: SELECT r1.`id`, r1.`name`, r2.`id`, r2.`name`, r4.`id`, r4.`name` FROM ((`test1`.`test1` r1 LEFT JOIN `test1`.`test1` r2 ON (((r2.`id` = r1.`id`)))) LEFT JOIN `test1`.`test1` r4 ON (((r4.`id` = r1.`id`))))
(6 rows)

This works (although with a more complicated query plan):

# explain verbose select * from ext1.test1 r1 left join ext1.test1 r2 on r2.id=r1.id left join ext1.test1 r3 on r3.id=r1.id where r2 is null;
                                                                                            QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Merge Right Join  (cost=40.00..1064.38 rows=125 width=108)
   Output: r1.id, r1.name, r2.id, r2.name, r3.id, r3.name
   Merge Cond: (r3.id = r1.id)
   ->  Foreign Scan on ext1.test1 r3  (cost=25.00..1025.00 rows=1000 width=36)
         Output: r3.id, r3.name
         Remote server startup cost: 25
         Remote query: SELECT `id`, `name` FROM `test1`.`test1` ORDER BY `id` IS NULL, `id` ASC
   ->  Materialize  (cost=15.00..35.06 rows=25 width=72)
         Output: r1.id, r1.name, r2.id, r2.name
         ->  Foreign Scan  (cost=15.00..35.00 rows=25 width=72)
               Output: r1.id, r1.name, r2.id, r2.name
               Filter: (r2.* IS NULL)
               Relations: (test1.test1 r1) LEFT JOIN (test1.test1 r2)
               Remote server startup cost: 25
               Remote query: SELECT r1.`id`, r1.`name`, r2.`id`, r2.`name` FROM (`test1`.`test1` r1 LEFT JOIN `test1`.`test1` r2 ON (((r2.`id` = r1.`id`)))) ORDER BY r1.`id` IS NULL, r1.`id` ASC
(15 rows)

This also works (query plan more similar to the crashing version):

# explain verbose select * from ext1.test1 r1 left join ext1.test1 r2 on r2.id=r1.id where r2.* is null;
                                                                   QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------
 Foreign Scan  (cost=15.00..35.00 rows=25 width=72)
   Output: r1.id, r1.name, r2.id, r2.name
   Filter: (r2.* IS NULL)
   Relations: (test1.test1 r1) LEFT JOIN (test1.test1 r2)
   Remote server startup cost: 25
   Remote query: SELECT r1.`id`, r1.`name`, r2.`id`, r2.`name` FROM (`test1`.`test1` r1 LEFT JOIN `test1`.`test1` r2 ON (((r2.`id` = r1.`id`))))
(6 rows)

Backtrace of the crashing query:

Program received signal SIGSEGV, Segmentation fault.
mysql_get_tuple_with_whole_row (nulls=0xaaab0dc8acc8, values=0xaaab0dc8ac80, festate=0xaaab0dc8a6f0) at ./mysql_fdw.c:3481
3481	./mysql_fdw.c: No such file or directory.
(gdb) bt
#0  mysql_get_tuple_with_whole_row (nulls=0xaaab0dc8acc8, values=0xaaab0dc8ac80, festate=0xaaab0dc8a6f0) at ./mysql_fdw.c:3481
#1  mysqlIterateForeignScan (node=<optimized out>) at ./mysql_fdw.c:856
#2  0x0000aaaacddc753c in ForeignNext (node=0xaaab0dc88ec0) at ./build/../src/backend/executor/nodeForeignscan.c:62
#3  0x0000aaaacddb82b4 in ExecScan (node=0xaaab0dc88ec0, accessMtd=0xaaaacddc74a0 <ForeignNext>, recheckMtd=0xaaaacddc7574 <ForeignRecheck>) at ./build/../src/backend/executor/execScan.c:198
#4  0x0000aaaacddae39c in ExecProcNode (node=0xaaab0dc88ec0) at ./build/../src/include/executor/executor.h:273
#5  ExecutePlan (execute_once=<optimized out>, dest=0xffff9e8cab28, direction=<optimized out>, numberTuples=0, sendTuples=true, operation=CMD_SELECT, use_parallel_mode=<optimized out>, planstate=0xaaab0dc88ec0, estate=0xaaab0dc88c70)
    at ./build/../src/backend/executor/execMain.c:1670
#6  standard_ExecutorRun (queryDesc=0xaaab0dbb5af0, direction=<optimized out>, count=0, execute_once=<optimized out>) at ./build/../src/backend/executor/execMain.c:365
#7  0x0000aaaacdf61614 in PortalRunSelect (portal=portal@entry=0xaaab0dc00860, forward=forward@entry=true, count=0, count@entry=9223372036854775807, dest=dest@entry=0xffff9e8cab28) at ./build/../src/backend/tcop/pquery.c:924
#8  0x0000aaaacdf62c5c in PortalRun (portal=portal@entry=0xaaab0dc00860, count=count@entry=9223372036854775807, isTopLevel=isTopLevel@entry=true, run_once=run_once@entry=true, dest=dest@entry=0xffff9e8cab28,
    altdest=altdest@entry=0xffff9e8cab28, qc=qc@entry=0xffffd5694818) at ./build/../src/backend/tcop/pquery.c:768
#9  0x0000aaaacdf5ea74 in exec_simple_query (query_string=query_string@entry=0xaaab0db89460 "select * from ext1.test1 r1 left join ext1.test1 r2 on r2.id=r1.id left join ext1.test1 r3 on r3.id=r1.id where r3 is null;")
    at ./build/../src/backend/tcop/postgres.c:1272
#10 0x0000aaaacdf5f7b4 in PostgresMain (dbname=<optimized out>, username=<optimized out>) at ./build/../src/backend/tcop/postgres.c:4654
#11 0x0000aaaacdeccb54 in BackendRun (port=0xaaab0dbb6e70) at ./build/../src/backend/postmaster/postmaster.c:4464
#12 BackendStartup (port=0xaaab0dbb6e70) at ./build/../src/backend/postmaster/postmaster.c:4192
#13 ServerLoop () at ./build/../src/backend/postmaster/postmaster.c:1782
#14 0x0000aaaacdecdb34 in PostmasterMain (argc=argc@entry=1, argv=argv@entry=0xaaab0db43960) at ./build/../src/backend/postmaster/postmaster.c:1466
#15 0x0000aaaacdbdb734 in main (argc=1, argv=0xaaab0db43960) at ./build/../src/backend/main/main.c:198

Hi @smilingthax,

Thanks for reporting this issue. I can reproduce the same at my end.
We will take this further and try to fix it.

Hi @smilingthax,

We have fixed this issue under commit - 3548b33. Please verify the same at your end and share your feedback. Thanks again for reporting this issue.

I can confirm that my original query (not just the simplified version I gave here) no longer crashes with current git HEAD.

(I did not yet thoroughly check that in all cases the correct result is obtained, though.)

Thanks for the confirmation. We will close this case. If you face any issues further then you can reopen or create new issue anytime.