postgresml/pgcat

Large results are very slow in pgcat

Opened this issue · 4 comments

Describe the bug
This only gets ~40TPS with the pgcat.minimal.config

pgbench -P 1  -r -h 127.0.0.1 -T 100 --select-only -p 6434 --protocol simple -c 1 -f <(echo "select '"$(seq 10000)"'")
pgbench (15.3, server 12.16 (Ubuntu 12.16-1.pgdg22.04+1))
starting vacuum...end.
progress: 1.0 s, 44.0 tps, lat 20.902 ms stddev 24.041, 0 failed
progress: 2.0 s, 37.0 tps, lat 27.298 ms stddev 24.211, 0 failed
progress: 3.0 s, 36.0 tps, lat 27.768 ms stddev 24.246, 0 failed
progress: 4.0 s, 39.0 tps, lat 25.638 ms stddev 24.298, 0 failed
progress: 5.0 s, 42.0 tps, lat 23.807 ms stddev 24.108, 0 failed
progress: 6.0 s, 28.0 tps, lat 35.717 ms stddev 21.997, 0 failed

But doing the same size query, while not returning something big is much faster:

pgbench -P 1  -r -h 127.0.0.1 -T 100 --select-only -p 6434 --protocol simple -c 1 -f <(echo "select 1 where '' != '"$(seq 10000)"'")
pgbench (15.3, server 12.16 (Ubuntu 12.16-1.pgdg22.04+1))
starting vacuum...end.
progress: 1.0 s, 1602.8 tps, lat 0.591 ms stddev 0.183, 0 failed
progress: 2.0 s, 1622.0 tps, lat 0.616 ms stddev 0.222, 0 failed
progress: 3.0 s, 1577.0 tps, lat 0.632 ms stddev 0.341, 0 failed
progress: 4.0 s, 1645.0 tps, lat 0.607 ms stddev 0.210, 0 failed

Isn't this expected? Why should we be able to return large result sets at the same speed as small ones?

Some slowdown is expected, but not 40x slower. Up to 2x makes sense, since total traffic is multiplied by 2. PgBouncer and Odyssey don't have this huge slowdown.

In production we see something similar. On queries that return either large rows or high row counts we see a 2-5x increase in latency over pgbouncer. Were still tweaking things but its been pretty consistent for us. We running docker pgcat:e1e4929d439313d987c352b4517a6d99627f3e9c

Showing 95th latencies for a query that return 200-500 rows. the left is pgbouncer and the right is pgcat
Screenshot 2024-02-01 at 1 52 41 PM