gleu/pgstats

Connection locks not valid

MichaelDBA opened this issue · 6 comments

When using the -s connection parameter, it shows almost all idle connections as having a lockwaiting event. That event field is not valid for current activity if state is idle. I am using PG V10 for this example with 1.0.1 pgstat version.

- total - active - lockwaiting - idle in transaction - idle -
     188        1           187                     0    180
     datname     |  pid  |    usename     |     wait_event      |  we_type  | state  |   backend_type    |    backend_start    |     query_start     | conn_secs | idle_secs | q_secs |           query
 ----------------+-------+----------------+---------------------+-----------+--------+-------------------+---------------------+---------------------+-----------+-----------+--------+---------------------------
     mydb        | 22521 |  myusername    |     ClientRead      | Client    | idle   | client backend    | 2019-04-05 11:07:15 | 2019-04-05 12:07:45 |      3631 |         1 |     -1 | select * from "player" wh


gleu commented

As far as I can tell, pgstat already does that:

For those in waiting:

sum(CASE WHEN backend_type='client backend' and state='active' and wait_event IS NOT NULL
         THEN 1 ELSE 0 end) AS lockwaiting, 

For those not waiting:

sum(CASE WHEN backend_type='client backend' and state='active' and wait_event IS NULL
         THEN 1 ELSE 0 end) AS active,
gleu commented

I just pushed a small fix to the sum of total number of connections in v10+, and did some refactoring on the queries.

I tried running it again after downloading and compiling via git clone and it still showed version 1.0.1. And of course it shows the same problem in V10. Look a the numbers below and you have to acknowledge a problem since the total number of connections is 200 but the sum of lockwaiting plus idle is almost doubled!

pgstat -h ec2-34-230-142-223.compute-1.amazonaws.com -U u4bvma41uca1bq -d db9sp1t366u3k5 -p 5432 -s connection


- total - active - lockwaiting - idle in transaction - idle -
    200        1           199                     0    192
    200        1           199                     0    192
    200        1           199                     0    191
    200        1           199                     0    191
    200        1           199                     0    191
    200        1           199                     0    192
    200        1           199                     0    192
    200        1           199                     0    192
    200        1           199                     0    192
    200        1           199                     0    191
    200        1           199                     0    192

When I run the queries from pgstat.c i get the right information.

-- 10.0
SELECT count(*) AS total, sum(CASE WHEN state='active' AND wait_event IS NULL THEN 1 ELSE 0 END) AS active, sum(CASE WHEN state='active' AND wait_event IS NOT NULL THEN 1 ELSE 0 END) AS lockwaiting,  sum(CASE WHEN state='idle in transaction' THEN 1 ELSE 0 END) AS idleintransaction, sum(CASE WHEN state='idle' THEN 1 ELSE 0 END) AS idle FROM pg_stat_activity WHERE backend_type='client backend';

9.6
SELECT count(*) AS total, sum(CASE WHEN state='active' AND wait_event IS NULL THEN 1 ELSE 0 END) AS active, sum(CASE WHEN state='active' AND wait_event IS NOT NULL THEN 1 ELSE 0 END) AS lockwaiting, sum(CASE WHEN state='idle in transaction' THEN 1 ELSE 0 END) AS idleintransaction, sum(CASE WHEN state='idle' THEN 1 ELSE 0 END) AS idle FROM pg_stat_activity;

gleu commented

So, if I understand you well, the queries send the right informations, but pgstat doesn't? It doesn't make sense. pgstat just displays the result. It doesn't transform it in any way.

I completely acknowledge that what you show is wrong. You shouldn't have such numbers. But I honestly don't know how you get them. All tests I've done are good, and don't show any issue.

Fixed. I had 2 versions of pgstat: one compiled with PG v10 and one compiled with PG v11. I was using the PG 10 version when I was getting the display errors, but now that I am using the PG v11 version of pgstat, it displays correctly. I can't explain why it didn't do it correctly even with the PG v10 version. I ran this against a PG 10 cluster.

In the pgstat directory where I downloaded it:

./pgstat -V

--> pgstats 1.0.1 (compiled with PostgreSQL 11.2 (Ubuntu 11.2-1.pgdg16.04+1))

In /usr/bin:

pgstat -V

--> pgstats 1.0.1 (compiled with PostgreSQL 10.5 (Ubuntu 10.5-2~30.git84cd21f.pgdg16.04+1))

 - total - active - lockwaiting - idle in transaction - idle -
     184        1             1                     0    182
     184        1             0                     0    183
     184        1             0                     0    183
     184        1             0                     0    183
     184        1             0                     0    183
     184        1             0                     0    183
     184        1             0                     0    183
     184        1             0                     0    183
     184        1             1                     0    182