fastly/pushpin

Memory increasing when clients connect but doesn't free up when they disconnect

jcox250 opened this issue · 5 comments

I'm seeing something strange with the memory usage of a pushpin container I have running locally, and I'm trying to understand whether it's a bug, intentional behaviour, or the result of some config I haven't set properly.

I have a simple Go server with a handler (the /stream endpoint) that returns the headers the docs list as required for HTTP streaming. In front of this server I run a pushpin proxy in a docker container using the fanout/pushpin image. I then have a small Go program that can spin up N clients, each of which makes a request to /stream and waits for events.
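The server code itself is only in the attached zip, so the following is a minimal sketch of what such a handler might look like rather than the code from the issue. It assumes the `Grip-Hold: stream` and `Grip-Channel` response headers described in the Pushpin HTTP streaming docs; the channel name `test`, the port `8080`, and the helper name `gripStreamHeaders` are placeholders chosen for illustration.

```go
package main

import (
	"fmt"
	"log"
	"net/http"
)

// gripStreamHeaders builds the response headers that ask Pushpin to hold the
// client connection open as an HTTP stream subscribed to the given channel.
// (Hypothetical helper; not taken from the attached zip.)
func gripStreamHeaders(channel string) http.Header {
	h := http.Header{}
	h.Set("Content-Type", "text/plain")
	h.Set("Grip-Hold", "stream")
	h.Set("Grip-Channel", channel)
	return h
}

// streamHandler answers the request that Pushpin proxies to the origin.
// The body written here becomes the first bytes of the client's stream.
func streamHandler(w http.ResponseWriter, r *http.Request) {
	for k, v := range gripStreamHeaders("test") { // "test" is an assumed channel name
		w.Header()[k] = v
	}
	fmt.Fprintln(w, "stream opened")
}

func main() {
	http.HandleFunc("/stream", streamHandler)
	log.Fatal(http.ListenAndServe(":8080", nil)) // assumed origin port
}
```

With a handler like this, the origin's response returns immediately and Pushpin, not the Go server, keeps the client connection open.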

However, when I run the program to connect N clients, I see the memory of the pushpin container increase. I assume some sort of buffer is being allocated for the connected clients? But when these clients disconnect, that memory doesn't seem to be freed. I've left the container running for a good few minutes and the memory usage never seems to decrease. I've included a short video demonstrating what I'm seeing.

pushpin memory recording.zip

This is the version of the pushpin docker image that I'm using

$ docker images | grep fanout/pushpin    
fanout/pushpin                                                    latest                          f43025c29c42   3 months ago    241MB

Here's a zip with my scripts, docker-compose file and pushpin.conf. To reproduce, you can do the following:

// Build the server and client binaries
make build 

// Bring up pushpin container
make dev

// Run Go Server
./server

// Connect clients
./clients --client 1000

puspin-server.zip

I'll try to look into this soon, but a quick answer is that some collection objects may not shrink their backing memory after items are removed from them, and we consider a certain amount of this to be acceptable. I suggest running the test multiple times and seeing if the memory increases begin to plateau.

You might also check the memory growth of individual processes (pushpin has a few), to see where it is occurring.

> I'll try to look into this soon, but a quick answer is that some collection objects may not shrink their backing memory after items are removed from them, and we consider a certain amount of this to be acceptable. I suggest running the test multiple times and seeing if the memory increases begin to plateau.

This is something I had noticed: after a certain number of connect/disconnect/reconnect cycles, the memory usage did plateau.

> You might also check the memory growth of individual processes (pushpin has a few), to see where it is occurring.

Sorry if this is in the docs somewhere and I've missed it, but how would I go about doing this?

> You might also check the memory growth of individual processes (pushpin has a few), to see where it is occurring.

One way is to install/run top within the container.
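If installing top in the container is inconvenient, roughly the same per-process numbers can be read straight from `/proc`. The sketch below is one possible way to do that, assuming a Linux container; it prints the resident set size (the `VmRSS` field, which should correspond to the RES column in top) for every process. The function name `parseStatus` is made up for this example, and the program would need to be built as a static binary and copied into the container.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strconv"
	"strings"
)

// parseStatus extracts the process name and resident set size (VmRSS, in kB)
// from the text of a /proc/<pid>/status file. ok is false when the file has
// no VmRSS line, which is the case for kernel threads.
func parseStatus(status string) (name string, rssKB int, ok bool) {
	for _, line := range strings.Split(status, "\n") {
		f := strings.Fields(line)
		if len(f) < 2 {
			continue
		}
		switch f[0] {
		case "Name:":
			name = f[1]
		case "VmRSS:":
			if n, err := strconv.Atoi(f[1]); err == nil {
				rssKB, ok = n, true
			}
		}
	}
	return name, rssKB, ok
}

func main() {
	paths, _ := filepath.Glob("/proc/[0-9]*/status")
	for _, p := range paths {
		data, err := os.ReadFile(p)
		if err != nil {
			continue // the process exited between listing and reading
		}
		if name, rss, ok := parseStatus(string(data)); ok {
			fmt.Printf("%-20s %8d kB  (%s)\n", name, rss, filepath.Dir(p))
		}
	}
}
```

Running it once right after the container starts and again after a few test cycles shows which process accounts for the growth.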

It looks like the condure process is using the most memory. Is that to be expected based on this?

> but a quick answer is that some collection objects may not shrink their backing memory after items are removed from them, and we consider a certain amount of this to be acceptable.

Running top after the container started up:

[Image: containerStart]

Top after I repeated my tests until the memory plateaued:

[Image: memoryPlateau]

That's a little surprising, since condure explicitly frees memory when connections go away. It does pre-allocate a bunch of stuff though. A possible explanation is that the underlying memory for these pre-allocations doesn't actually get allocated by the kernel until the memory is first accessed, causing a jump in memory use once the process gets warmed up a bit.

These memory numbers can be a bit mysterious. If repeated tests of the same traffic level don't cause memory to continually increase, then there probably isn't a real leak.
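The lazy-allocation explanation above can be observed in isolation with a small experiment. The sketch below is not condure's code; it is a standalone Go program, assuming Linux, that allocates a large buffer and prints the process's resident memory before and after the buffer is first written. The buffer size and the function names `rssKB` and `touch` are arbitrary choices for the demonstration. On a typical Linux system the resident size should stay roughly flat after the allocation and rise only once the pages are touched, though the exact numbers depend on the kernel and the Go runtime.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

const bufSize = 256 << 20 // 256 MiB, an arbitrary size for the demonstration

// rssKB returns this process's resident set size in kB as reported by
// /proc/self/status, or -1 if it cannot be read (for example off Linux).
func rssKB() int {
	data, err := os.ReadFile("/proc/self/status")
	if err != nil {
		return -1
	}
	for _, line := range strings.Split(string(data), "\n") {
		f := strings.Fields(line)
		if len(f) >= 2 && f[0] == "VmRSS:" {
			if n, err := strconv.Atoi(f[1]); err == nil {
				return n
			}
		}
	}
	return -1
}

// touch writes one byte per page so that the kernel has to back each page
// with physical memory. It returns the number of pages written.
func touch(buf []byte, pageSize int) int {
	pages := 0
	for i := 0; i < len(buf); i += pageSize {
		buf[i] = 1
		pages++
	}
	return pages
}

func main() {
	fmt.Println("RSS at start:        ", rssKB(), "kB")
	buf := make([]byte, bufSize) // address space reserved, pages usually not resident yet
	fmt.Println("RSS after allocating:", rssKB(), "kB")
	touch(buf, os.Getpagesize())
	fmt.Println("RSS after touching:  ", rssKB(), "kB")
}
```

A process that pre-allocates in this way can show a one-time jump in memory the first time real traffic exercises those buffers, without any leak being involved.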