citusdata/cstore_fdw

performance issue with views and functions.

xvaara opened this issue · 2 comments

Hi, I was testing cstore_fdw against plain pg table and noticed this:

pg=# create foreign table cstore_test (time timestamp,meter_id int,data real) server cstore_server options(compression'pglz');
CREATE FOREIGN TABLE
Time: 31.434 ms

pg=# INSERT INTO cstore_test SELECT "time", FLOOR(RANDOM() * 10), 10- (RANDOM() * 20) FROM GENERATE_SERIES(NOW() - INTERVAL '1 year', NOW(), INTERVAL '1 seconds') TIME;
INSERT 0 31536001
Time: 41724.510 ms (00:41.725)

pg=#  CREATE OR REPLACE FUNCTION increment(i real) RETURNS real AS $$
pg$#         BEGIN
pg$#                 RETURN i + 1;
pg$#         END;
pg$# $$ LANGUAGE plpgsql;
CREATE FUNCTION
Time: 41.641 ms
sirate=# create view cstore_view as select time,meter_id,increment(data) from cstore_test;
CREATE VIEW
Time: 63.024 ms

pg=# select time,meter_id,increment(data) from cstore_test where time > '2018-03-16 23:55:00' and meter_id = 1;
            time            | meter_id |  increment
----------------------------+----------+--------------
 2018-03-18 04:40:57.048955 |        1 |       9.6884
 2018-03-18 04:41:21.048955 |        1 |     -4.72585
 2018-03-18 04:41:31.048955 |        1 |      2.50237
 2018-03-18 04:41:34.048955 |        1 |     -3.65364
 2018-03-18 04:41:35.048955 |        1 |     -0.16955
 2018-03-18 04:42:07.048955 |        1 |     -5.81057
 2018-03-18 04:42:13.048955 |        1 |      5.37638
 2018-03-18 04:42:23.048955 |        1 |     -2.19258
Time: 127.635 ms

pg=# select * from cstore_view where time > '2018-03-16 23:55:00' and meter_id = 1;
            time            | meter_id |     data
----------------------------+----------+--------------
 2018-03-18 04:40:57.048955 |        1 |       9.6884
 2018-03-18 04:41:21.048955 |        1 |     -4.72585
 2018-03-18 04:41:31.048955 |        1 |      2.50237
 2018-03-18 04:41:34.048955 |        1 |     -3.65364
 2018-03-18 04:41:35.048955 |        1 |     -0.16955
 2018-03-18 04:42:07.048955 |        1 |     -5.81057
 2018-03-18 04:42:13.048955 |        1 |      5.37638
 2018-03-18 04:42:23.048955 |        1 |     -2.19258
Time: 2784.645 ms (00:02.785)

And after I inserted another 31M rows the times got twice as bad: Time: 5239.377 ms (00:05.239).

If using vanilla tables it's about the same time using query or view. Time: 4424.318 ms (00:04.424)

I think that is related to how views are planned. cstore_fdw might be returning all rows instead of filtering at stripe/block level.

similar issue is reported by another user for queries inside functions. He confirmed that after marking as the function as STABLE it worked as expected.