cstore_fdw should have better support for joins
mtuncer opened this issue · 1 comments
mtuncer commented
When cstore_fdw table is involved in a join, postgres pulls all data from cstore_fdw table. It makes queries unnecessarily longer.
We are getting more requests to make this work. I do not know the involved work at the moment. We should definitely investigate this for future enhancement.
tomvantilburg commented
Here's a quick setup to tell the story:
Create data
CREATE FOREIGN TABLE cstore_tmp (X integer,Y integer) SERVER cstore_server;
INSERT INTO cstore_tmp
SELECT s1, s2
FROM generate_series(1,3000) s1, generate_series(1,3000) s2;
Query with 1:N join (same as no join)
EXPLAIN SELECT count(b.*)
FROM (
SELECT 300 minx, 300 miny, 600 maxx, 600 maxy
) a
JOIN cstore_tmp b
ON b.x BETWEEN a.minx AND a.maxx
AND b.y BETWEEN a.miny AND a.maxy;
Result:
Aggregate (cost=189072.56..189072.57 rows=1 width=8)
-> Foreign Scan on cstore_tmp b (cost=0.00..189072.00 rows=225 width=32)
Filter: ((x >= 300) AND (x <= 600) AND (y >= 300) AND (y <= 600))
CStore File: /var/data/postgresql/10/main/cstore_fdw/16384/191674758
CStore File Size: 74310480
Query with N:N join
SELECT count(b.*)
FROM (
SELECT 300 minx, 300 miny, 500 maxx, 500 maxy
UNION ALL
SELECT 400 minx, 400 miny, 600 maxx, 600 maxy
) a
JOIN cstore_tmp b
ON b.x BETWEEN a.minx AND a.maxx
AND b.y BETWEEN a.miny AND a.maxy;
Result:
Aggregate (cost=504627.60..504627.61 rows=1 width=8)
-> Nested Loop (cost=0.00..504072.05 rows=222222 width=32)
Join Filter: ((b.x >= "*SELECT* 1".minx) AND (b.x <= "*SELECT* 1".maxx) AND (b.y >= "*SELECT* 1".miny) AND (b.y <= "*SELECT* 1".maxy))
-> Foreign Scan on cstore_tmp b (cost=0.00..99072.00 rows=9000000 width=40)
CStore File: /var/data/postgresql/10/main/cstore_fdw/16384/191674758
CStore File Size: 74310480
-> Materialize (cost=0.00..0.05 rows=2 width=16)
-> Append (cost=0.00..0.04 rows=2 width=16)
-> Subquery Scan on "*SELECT* 1" (cost=0.00..0.02 rows=1 width=16)
-> Result (cost=0.00..0.01 rows=1 width=16)
-> Subquery Scan on "*SELECT* 2" (cost=0.00..0.02 rows=1 width=16)
-> Result (cost=0.00..0.01 rows=1 width=16)