Random disconnects
dridri opened this issue · 4 comments
Hi, I oftenly got this error :
/usr/share/lua/5.3/pgmoon/init.lua:600: attempt to index a nil value (field 'sock')
Stack trace :
in ??(self, t, data, len) at /usr/share/lua/5.3/pgmoon/init.lua:600
in q(self, q) at /usr/share/lua/5.3/pgmoon/init.lua:294
It seems that pgmoon randomly disconnects from the PSQL server, I'm running it standalone (without Resty). I tried dumb fixes such as setTimeout(-1), and trying to "ping" the database every 30seconds, but still not enough.
I found out that the socket is garbage collected under heavy load, hooking __gc
of pgresult.sock.sock
is enough to prevent it. So it seems to be related to luasocket, but not 100% sure.
I'm closing this as it appears to be a luasocket bug. I recommend using cqueues if possible as your network socket.
I found out that the socket is garbage collected under heavy load, hooking
__gc
ofpgresult.sock.sock
is enough to prevent it.
Interesting, seeing something that appears to be the same problem, using HAProxy Socket class... how does one hook __gc
, do you mean overload that function definition?
Initially pg connects fine and even waits for a notification, but then begins to error:
[kubernetes-ingress-controller] Init new socket of type:
[kubernetes-ingress-controller] haproxy
[kubernetes-ingress-controller] HAProxy class Socket table
[kubernetes-ingress-controller] LISTEN test
[kubernetes-ingress-controller] Checking for async postgres notification every 10000 ms
[kubernetes-ingress-controller]
[kubernetes-ingress-controller] Waiting for notifications...
[kubernetes-ingress-controller] nil
...
[kubernetes-ingress-controller] Waiting for notifications...
[kubernetes-ingress-controller] Lua task: runtime error: /usr/local/share/lua/5.3/pgmoon/init.lua:740: attempt to index a nil value (field 'sock') from /usr/local/share/lua/5.3/pgmoon/init.lua:740: in method 'receive_message', /usr/local/share/lua/5.3/pgmoon/init.lua:559: in method 'wait_for_notification', /usr/local/share/lua/5.3/auth.lua:176: in function line 150.
My code:
- Creates a new
pg
connection instance viapgmoon.new
- Queries
LISTEN test;
to listen for notifications (pgnotify) - Loops, calling
pg:wait_for_notification()
in each iteration of the loop
Here is my Lua code (runs in HAProxy):
function listen()
print(pg_settings)
local pg = pgmoon.new(pg_settings)
pg:settimeout(5000) -- Sets the timeout value (in milliseconds) for all socket operations (connect, write, receive)
pg.convert_null = true -- https://github.com/leafo/pgmoon#converting-nulls
local success, err = pg:connect()
if not success then
local msg = string.format(
'Error connecting (%s) host=%s db=%s user=%s',
err,
pg_settings['host'], pg_settings['database'], pg_settings['user']
);
core.log(core.err, msg)
reply_503(txn)
end
-- query database for auth info for the token
local interval = 10000
local channel = 'test'
local query = string.format("LISTEN %s", channel)
print(query)
pg:query(query)
print(string.format("Checking for async postgres notification every %s ms\n", interval))
while true do
print('Waiting for notifications...')
local n = pg:wait_for_notification() -- store result in some table
print(dump(n))
core.msleep(interval)
end
end
core.register_task(listen)
Anything obviously wrong about this?
My issue was (you can see in the code above) that my timeout was 5000
ms while the loop interval was 10000
ms so naturally the socket was closed after one iteration of the loop. Sorry for any noise/confusion!