heterodb/pg-strom

Segmentation fault when running DELETE after UPDATE with gpu_cache enabled

Closed this issue · 3 comments

Running a DELETE after an UPDATE on column a kills the server process with a segmentation fault.

UPDATE cache_test_table SET a = (a+1) % 127 WHERE a%97=0;
DELETE FROM cache_test_table WHERE a%101=0 OR a IS NULL;

Error message:

postgres=# DELETE FROM cache_test_table WHERE a%101=0 OR a IS NULL;
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
The connection to the server was lost. Attempting reset: Failed.

Execution plan:

postgres=# explain DELETE FROM cache_test_table WHERE a%101=0 OR a IS NULL;
                                       QUERY PLAN                                       
----------------------------------------------------------------------------------------
 Delete on cache_test_table  (cost=100.00..102.44 rows=0 width=0)
   ->  Custom Scan (GpuScan) on cache_test_table  (cost=100.00..102.44 rows=57 width=6)
         GPU Projection: ctid
         GPU Scan Quals: ((((a)::integer % 101) = 0) OR (a IS NULL)) [rows: 4000 -> 57]
         GPU Cache: GPU0 [phase: ready, max_num_rows: 10000]
(5 rows)

Full reproduction script:

SET pg_strom.regression_test_mode = on;
SET client_min_messages = error;
DROP SCHEMA IF EXISTS gpu_cache_temp_test CASCADE;
CREATE SCHEMA gpu_cache_temp_test;
RESET client_min_messages;
SET search_path = gpu_cache_temp_test,public;
---
--- Creating a table on GPU cache
---
CREATE TABLE cache_test_table (
  id   int,
  a    int1
);
---
--- GPU Cache configuration
---
CREATE TRIGGER row_sync_test AFTER INSERT OR UPDATE OR DELETE ON cache_test_table FOR ROW 
    EXECUTE FUNCTION pgstrom.gpucache_sync_trigger('gpu_device_id=0,max_num_rows=10000,redo_buffer_size=150m,gpu_sync_threshold=10m,gpu_sync_interval=4');
ALTER TABLE cache_test_table ENABLE ALWAYS TRIGGER row_sync_test;
-- Make GPU cache 
INSERT INTO cache_test_table(id) values (1);
-- Check gpucache_info table.
SELECT config_options FROM pgstrom.gpucache_info WHERE table_name='cache_test_table' AND database_name=current_database();

TRUNCATE TABLE cache_test_table;
-- Force to use GPU Cache
SET enable_seqscan=off;
---
--- INSERT 
---
EXPLAIN (costs off, verbose)
INSERT INTO cache_test_table(id) values (1);


INSERT INTO cache_test_table (
  SELECT x 
  ,pgstrom.random_int(1,-128,127)     -- a int1
  FROM generate_series(1,4000) x
);
VACUUM ANALYZE;


UPDATE cache_test_table SET a = (a+1) % 127 WHERE a%97=0;
DELETE FROM cache_test_table WHERE a%101=0 OR a IS NULL;

gdb backtrace:

#0  0x00007f95062753d8 in pgstromExecResetTaskState () from /home/onishi/pgbin152/lib/postgresql/pg_strom.so
#1  0x0000000000666b5e in ExecReScan ()
#2  0x00000000006754a2 in EvalPlanQualNext ()
#3  0x00000000006758c6 in EvalPlanQual ()
#4  0x000000000069af9e in ExecDelete ()
#5  0x000000000069d041 in ExecModifyTable ()
#6  0x000000000067336b in standard_ExecutorRun ()
#7  0x00000000007c8e20 in ProcessQuery ()
#8  0x00000000007c97aa in PortalRunMulti ()
#9  0x00000000007c9bcf in PortalRun ()
#10 0x00000000007c62bf in exec_simple_query ()
#11 0x00000000007c6f28 in PostgresMain ()
#12 0x000000000075012b in ServerLoop ()
#13 0x0000000000751004 in PostmasterMain ()
#14 0x00000000004efe89 in main ()

The error above occurs when the statements are run in the order INSERT → UPDATE → DELETE,
but it does not occur with INSERT → DELETE alone.

DELETE FROM cache_test_table WHERE a%101=0 OR a IS NULL;
DELETE 77

On the other hand, running the above query twice produces the error "attempted to delete invisible tuple".

DELETE FROM cache_test_table WHERE a%101=0 OR a IS NULL;
DELETE 77
postgres=# DELETE FROM cache_test_table WHERE a%101=0 OR a IS NULL;
ERROR:  attempted to delete invisible tuple

Fixed in 346cd25ed15c74100bb2c02ac2563eab7bdc252f.

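ExecReScan() was issued against a CustomScan that had not yet been executed even once.
As a result, fields that are only initialized on the first execution were touched without any check, which caused a NULL pointer dereference.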
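
The failure pattern described above can be illustrated with a small, self-contained C sketch. This is not the code from the fix commit and it does not use PostgreSQL or PG-Strom headers. ScanNode, TaskState, scan_next, scan_rescan and scan_end are hypothetical names chosen for illustration only. The sketch assumes the per-scan state is created lazily on the first fetch, so a rescan that arrives earlier has to check for NULL before resetting anything.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical per-scan state that exists only after the first execution. */
typedef struct TaskState {
    int rows_fetched;   /* rows returned since the last (re)start */
    int rescan_count;   /* number of resets applied to this state */
} TaskState;

/* Hypothetical scan node; task stays NULL until the scan first runs. */
typedef struct ScanNode {
    TaskState *task;
} ScanNode;

/* Fetch one row, creating the task state lazily on the first call. */
void scan_next(ScanNode *node)
{
    if (node->task == NULL) {
        node->task = calloc(1, sizeof(TaskState));
        if (node->task == NULL)
            abort();            /* out of memory */
    }
    assert(node->task != NULL);
    node->task->rows_fetched++;
}

/*
 * Rescan callback. Without the NULL check, a rescan issued before the
 * first scan_next() would dereference a NULL pointer.
 * Returns 1 if existing state was reset, 0 if there was nothing to reset.
 */
int scan_rescan(ScanNode *node)
{
    if (node->task == NULL)
        return 0;               /* never executed: nothing to reset */
    node->task->rows_fetched = 0;
    node->task->rescan_count++;
    return 1;
}

/* Release the task state, if any. */
void scan_end(ScanNode *node)
{
    free(node->task);
    node->task = NULL;
}
```

Calling scan_rescan() on a freshly zeroed ScanNode returns 0 and leaves the node untouched. After one scan_next() call, it resets rows_fetched to 0 and returns 1.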

I have confirmed the fix. Thank you.