UserParameter for streaming replication byte lag can produce negative values
jsh-hk opened this issue · 1 comments
The user parameter to calculate byte lag can produce negative values, causing the item to fall into "Not Supported" status in Zabbix.
The negative values occur when sent_location is lower than replay_location, which makes it appear that the slave is "ahead" of the master: http://pgsql.privatepaste.com/db98e7efb8 . I ran into this in my environment and determined that it was caused by sent_location being lower than the master's pg_current_xlog_location. I concluded that the negative values are not a real indicator of streaming replication byte lag.
The fix is simply to have the byte lag UserParameter return 0 when the value would be negative:
UserParameter=pgsql.streaming.lag.bytes[*],psql -qAtX $1 -c "select GREATEST(0,pg_xlog_location_diff(sent_location, replay_location)) AS pg_xlog_location_diff from pg_stat_replication where client_addr = '$2'"
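For anyone who wants to sanity-check the clamp outside of psql, here is a minimal Python sketch of what `GREATEST(0, pg_xlog_location_diff(sent_location, replay_location))` amounts to. The helper names `lsn_to_bytes` and `clamped_lag_bytes` are made up for illustration and are not part of the template. It also assumes the xlog location is a plain 64-bit position written as two hex halves (`hi/lo`), which I believe holds for PostgreSQL 9.3 and later; older releases may lay out the location differently.

```python
def lsn_to_bytes(lsn: str) -> int:
    """Convert an xlog location such as '0/3000060' to a byte position.

    Assumption: the location is a linear 64-bit value written as two
    hexadecimal halves separated by '/'.
    """
    hi, lo = lsn.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def clamped_lag_bytes(sent_location: str, replay_location: str) -> int:
    """Byte lag between sent and replay locations, never below zero.

    Mirrors GREATEST(0, pg_xlog_location_diff(sent, replay)): a negative
    difference is reported as 0 instead of a negative number.
    """
    diff = lsn_to_bytes(sent_location) - lsn_to_bytes(replay_location)
    return max(0, diff)


if __name__ == "__main__":
    # Normal case: replay is 0x60 bytes behind what was sent.
    print(clamped_lag_bytes("0/3000060", "0/3000000"))  # 96
    # Anomalous case from this issue: sent is lower than replay.
    print(clamped_lag_bytes("0/3000000", "0/3000060"))  # 0
```

The clamp hides the anomaly rather than explaining it, so the item stays supported in Zabbix but reads 0 while sent_location is behind replay_location.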
P.S. Thank you for sharing this template; it's by far the best solution for monitoring PostgreSQL with Zabbix, IMO.
Yes, I saw this behaviour in some rare cases. Fixed for pgsql.streaming.lag.bytes and pgsql.streaming.lag.seconds. Thanks!