pgcat decreases TPS from unproxied PostgreSQL 16 by 4x to 5x (docker-compose reproducer)
Describe the bug
Performance of pgcat is unexpectedly low on x86 hardware, but not on Apple Silicon.
Platform | baseline, 10 clients (TPS) | pgcat, 10 clients (TPS) | pgcat, 100 clients (TPS) |
---|---|---|---|
m1max | 154 | 624 | 577 |
ryzen 5900x wsl | 133 | 19 | 19 |
i7-8750H ubuntu | 88 | 20 | 20 |
I hope I'm doing something wrong. :)
To Reproduce
docker-compose reproducer: https://github.com/cpbotha/pgcat-docker-compose-demo
From that readme:
Start the cluster:
docker-compose up
Try with pgbench:
# https://www.postgresql.org/docs/current/pgbench.html
# init "postgres" database for pgbench
pgbench -p 6432 -h localhost -i -U postgres postgres
# baseline perf with 10 clients to the bare postgres
pgbench -p 5432 -h localhost -T 20 -C -c 10 -n -U postgres postgres
# 10 clients via pgcat
pgbench -p 6432 -h localhost -T 20 -C -c 10 -n -U postgres postgres
# 100 clients via pgcat
# bench with 100 clients for 20 seconds
# -C: new connection for each transaction
# -n: no vacuuming before running test
pgbench -p 6432 -h localhost -T 20 -C -c 100 -n -U postgres postgres
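To compare runs without reading the numbers off by eye, the sketch below pulls the TPS figure out of pgbench's output and turns two figures into a slowdown factor. The helper names `extract_tps` and `slowdown` are made up for this illustration and are not part of pgbench or pgcat. It assumes pgbench prints a line starting with `tps = `, and takes the first such line if there are several.

```shell
# Hypothetical helpers, not part of pgbench or pgcat.

# Read pgbench output on stdin and print the first "tps = <number>" value.
extract_tps() {
  awk '/^tps = /{ print $3; exit }'
}

# Print baseline TPS divided by proxied TPS, to one decimal place.
# A value above 1 means the proxied run was slower than the baseline.
slowdown() {
  awk -v base="$1" -v prox="$2" 'BEGIN { printf "%.1f\n", base / prox }'
}

# Usage, assuming the cluster from the reproducer is up:
#   base=$(pgbench -p 5432 -h localhost -T 20 -C -c 10 -n -U postgres postgres | extract_tps)
#   prox=$(pgbench -p 6432 -h localhost -T 20 -C -c 10 -n -U postgres postgres | extract_tps)
#   slowdown "$base" "$prox"
```

Applied to the table above, `slowdown 133 19` prints `7.0` for the Ryzen machine and `slowdown 88 20` prints `4.4` for the i7.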
Expected behavior
Going via pgcat should not decrease TPS by 4x to 5x like it does on x86.
Not a maintainer here, but I'm wondering whether you pulled the image or built it locally from the Dockerfile. Differences between architectures may be fixable by tweaking the release profile, adding compilation flags, or enabling/disabling jemalloc. Also, I assume you ran all the tests in Docker, since afaik Docker on Mac runs in its own VM?
I have a fairly powerful AMD server at work that we are currently in the middle of doing benchmarking on to evaluate different database configurations, and we are considering benchmarking pgcat as well. Do you see a similar dropoff at all pool sizes or just at 10?
It's pulling exactly the same image; please see the linked reproducer project.
What's interesting is that the exact same config gives pretty good perf on Apple Silicon, where x86 is being emulated (!!), while perf on native x86 is super sad.
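For anyone wanting to double-check the emulation part on their own machine, here is a rough sketch that compares the host architecture with the architecture recorded in the pulled image. The helper names `normalize_arch` and `needs_emulation` are invented for this example, and `IMAGE` is a placeholder: take the real image name from the reproducer's docker-compose.yml.

```shell
# Hypothetical helpers for comparing host and image architecture.

# Map `uname -m` style names onto Docker's architecture names.
normalize_arch() {
  case "$1" in
    x86_64|amd64) echo amd64 ;;
    aarch64|arm64) echo arm64 ;;
    *) echo "$1" ;;
  esac
}

# Print "native" if host arch ($1) matches image arch ($2), else "emulated".
needs_emulation() {
  if [ "$(normalize_arch "$1")" = "$(normalize_arch "$2")" ]; then
    echo native
  else
    echo emulated
  fi
}

# Usage (IMAGE is a placeholder; the image must already be pulled locally):
#   needs_emulation "$(uname -m)" \
#     "$(docker image inspect --format '{{.Architecture}}' "$IMAGE")"
```

This only tells you whether emulation is involved. It does not explain why the native x86 runs are the slow ones.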
I would be really curious to hear about the outcomes of your benchmarks if you do decide to add pgcat to the mix. I did not experiment with different pool sizes yet and probably won't have time soon; I have had to postpone the pgcat integration work due to this perf issue.