OptimalBits/redbird

Memory leak / WebSockets?

endel opened this issue · 6 comments

endel commented

Hi @manast, thank you so much for providing this proxy. I've been running it in production for a couple of weeks, and it's working really well. The only problem I seem to have is a memory leak.

I'm proxying a lot of WebSocket connections (from Colyseus) and I'm seeing memory usage grow without stopping (until the process gets restarted). Do you have any idea where the leak could come from? I'd like to inspect this, and if I succeed I can send you a PR.

Thanks a lot!
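
One way to start inspecting growth like this is to log the process's own memory figures over time. The snippet below is a minimal sketch, not something from redbird: `memorySnapshotMb` is a hypothetical helper name, and the one-minute interval is an arbitrary assumption. It only uses Node's built-in `process.memoryUsage()`.

```javascript
// Hypothetical helper: snapshot of this process's memory, in whole megabytes.
function memorySnapshotMb() {
  const { rss, heapUsed, external } = process.memoryUsage();
  const toMb = (bytes) => Math.round(bytes / 1024 / 1024);
  return { rss: toMb(rss), heapUsed: toMb(heapUsed), external: toMb(external) };
}

// Log one line per minute (assumed interval); unref() so this timer alone
// never keeps the process alive.
setInterval(() => {
  console.log("memory (MB):", memorySnapshotMb());
}, 60000).unref();
```

If `rss` keeps climbing while `heapUsed` stays flat, the growth is probably outside the JavaScript heap (for example in socket buffers), which would point away from plain object leaks.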

I really don't know. AFAIK there are no memory leaks in redbird; we keep it running for many months on high-traffic sites without any memory issues.

endel commented

Hi @manast, I was considering that the issue could be in PM2, but it really looks like it's in redbird.

My Colyseus processes never grow beyond 200 MB of memory, while the proxy keeps eating more and more. Again, I could not reproduce this locally; it happens only in production 😭

In this screenshot, you can see redbird using 37% of memory, while my other process uses only 13%:
[Screenshot 2020-01-27 at 11 19 06]

If you happen to have any advice, I'd appreciate it! Cheers!

If it is a memory leak, it should be possible to reproduce it locally, provided it is exactly the same code...

endel commented

I spent most of yesterday inspecting this issue, and I found it interesting that some connections were not being removed from the proxy (http-proxy).

I added this piece of code to check the number of connections on both my regular server and the proxy server:

// `server` is the underlying Node.js server (a net.Server / http.Server)
server.getConnections((err, numConnections) => {
  if (err) throw err;
  console.log("Proxy connections:", numConnections);
});
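
To watch the count over time rather than check it once, the same call can be wrapped in a small poller. This is a sketch under assumptions: `countConnections` and `logConnections` are hypothetical helper names, and the 10-second default interval is arbitrary. `server.getConnections()` itself is the standard `net.Server` method used above.

```javascript
const { promisify } = require("util");

// Hypothetical helper: resolves with the number of sockets currently open
// on a net.Server or http.Server.
function countConnections(server) {
  return promisify(server.getConnections.bind(server))();
}

// Hypothetical helper: logs the connection count every `intervalMs`.
function logConnections(server, label, intervalMs = 10000) {
  const timer = setInterval(async () => {
    try {
      console.log(`${label} connections:`, await countConnections(server));
    } catch (err) {
      console.error(`${label}: getConnections failed`, err);
    }
  }, intervalMs);
  timer.unref(); // logging alone should not keep the process alive
  return timer;
}
```

Calling `logConnections(server, "Proxy")` on the proxy and `logConnections(server, "Server")` on the application would produce lines in the same shape as the output quoted in this comment.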

After stress-testing it and forcibly dropping all connections from the user end, some of them were still not removed from the proxy server:

0|colyseus-app  | Server connections: 0
1|proxy         | Proxy connections: 312

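After some digging, I found that http-proxy calls socket.setTimeout(0), which disables the idle timeout, so I tried setting a timeout on the sockets myself:

// The 'timeout' event only notifies; the socket must be closed manually.
socket.on('timeout', () => {
  socket.end();
  socket.destroy();
});
socket.setTimeout(5000); // idle timeout in milliseconds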
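
The same idea can be packaged as a reusable function. This is only a sketch: `destroyWhenIdle` is a hypothetical name, not part of redbird or http-proxy, and the 5000 ms default simply mirrors the value tried above. It relies on documented `net.Socket` behaviour: `setTimeout()` arms an inactivity timer, and the `'timeout'` event does not close the socket by itself.

```javascript
// Hypothetical helper: destroy a net.Socket once it has been idle
// (no bytes read or written) for `idleMs` milliseconds.
function destroyWhenIdle(socket, idleMs = 5000) {
  socket.setTimeout(idleMs);
  socket.once("timeout", () => {
    // 'timeout' is only a notification; the socket stays open until closed.
    socket.destroy();
  });
  return socket;
}
```

With http-proxy, one plausible place to call it (an assumption, not verified here) is the server's `'upgrade'` handler, right after `proxy.ws(req, socket, head)`, so that it takes effect after the library's own `setTimeout(0)`. Note that an idle timeout also closes healthy WebSocket connections that stay quiet for longer than `idleMs`, so it probably needs to be paired with application-level pings or a longer timeout.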

Then the idle connections were gone:

0|colyseus-app  | Server connections: 0
1|proxy         | Proxy connections: 0

I wonder if I'm doing something wrong on my own application's end that is not properly closing the connection in the proxy, or if it's really an issue in the proxy. Not sure if you can help me, just wanted to share my thought process here. If you can give me any thoughts I'd really appreciate it. Thanks a lot!

Hi @endel,
We're having the same issue with redbird: the process is killed by the OOM killer. We experienced only a single failure when traffic was low, but now that traffic has increased we're seeing it quite often. How did you resolve this issue?

Any pointers would be really helpful

Thanks

endel commented

Hi @rahulwinzo, unfortunately I still have not been able to fix the issue. My proxy restarts roughly every week. (I'm now using plain node-http-proxy, but it behaves no differently from redbird in this respect.)

I've also failed to reproduce it locally. The problem does indeed seem to lie in node-http-proxy (which redbird uses under the hood).

I'd really appreciate it if you could take a shot at reproducing it locally as well.