lesovsky/zabbix-extensions

UserParameter for streaming replication byte lag can produce negative values

jsh-hk opened this issue · 1 comment

The user parameter to calculate byte lag can produce negative values, causing the item to fall into "Not Supported" status in Zabbix.

These negative values occur when sent_location is lower than replay_location, which makes it appear that the slave is "ahead" of the master (example output: http://pgsql.privatepaste.com/db98e7efb8). I ran into this in my environment and determined that it was caused by sent_location being lower than the master's pg_current_xlog_location, so the negative values are not a real indicator of streaming replication byte lag.

The fix is simply to have the byte lag UserParameter return 0 whenever the computed value is negative:

UserParameter=pgsql.streaming.lag.bytes[*],psql -qAtX $1 -c "select GREATEST(0,pg_xlog_location_diff(sent_location, replay_location)) AS pg_xlog_location_diff from pg_stat_replication where client_addr = '$2'"
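
For anyone who wants to see the arithmetic behind that clamp, here is a rough Python sketch of what the query is doing. It assumes the location strings are plain 64-bit positions written as two hex halves ("high/low"), which as far as I know holds for PostgreSQL 9.3 and later. The function names and the sample locations are made up for illustration and are not part of the template.

```python
def lsn_to_bytes(lsn: str) -> int:
    """Convert an 'X/Y' hex location string into an absolute byte position.

    Assumption: the location is a linear 64-bit value, with the part before
    the slash holding the high 32 bits and the part after it the low 32 bits.
    """
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def clamped_lag_bytes(sent_location: str, replay_location: str) -> int:
    """Mimic GREATEST(0, sent - replay): never report a negative lag."""
    raw = lsn_to_bytes(sent_location) - lsn_to_bytes(replay_location)
    return max(0, raw)


if __name__ == "__main__":
    # Hypothetical sample locations, not taken from the paste above.
    print(clamped_lag_bytes("0/3000060", "0/3000000"))  # replay is behind sent
    print(clamped_lag_bytes("0/3000000", "0/3000060"))  # replay appears "ahead"
```

The second call is the case described in this issue: the raw difference is negative, so the clamp reports 0 instead of a value Zabbix would reject.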

P.S. Thank you for sharing this template, it's by far the best solution for monitoring PostgreSQL with Zabbix IMO

Yes, I saw this behaviour in some rare cases. Fixed for pgsql.streaming.lag.bytes and pgsql.streaming.lag.seconds. Thanks!