linkyndy/remodel

Connection pool will not honor maximum connections if all connections in use

cjhenck opened this issue · 8 comments

It seems to me that the connection pool will not honor the maximum connections limit, because every time a connection is put back into the queue, the count is decremented. So take the following scenario:

  • max_connections is 5
  • Queue is empty five calls in a row: _created_connections.current() is now 5
  • All five connections are put back into the pool: _created_connections.current() is now 0
  • All five connections are taken out of the queue: _created_connections.current() is now 0
  • Five additional connections can now be created, even though the first five are still in use, so up to 10 connections exist against a limit of 5.
    def get(self):
        try:
            return self.q.get_nowait()
        except Empty:
            # A new connection is created only while the counter is below
            # the limit...
            if self._created_connections.current() < self.max_connections:
                conn = self.connection_class(**self.connection_kwargs).conn
                self._created_connections.incr()
                return conn
            raise

    def put(self, connection):
        self.q.put(connection)
        # ...but the counter is decremented on every put(), even though the
        # returned connection still exists, so the limit quickly stops
        # meaning anything.
        self._created_connections.decr()
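
One way to fix this (a minimal sketch of the idea, not the actual patch) is to treat _created_connections as the number of connections that currently exist: increment it only when a new connection is created and decrement it only when a connection is discarded for good, never when it is merely returned to the queue. The dispose() helper below is hypothetical and only illustrates where the decrement would move:

    def get(self):
        try:
            # Reuse an idle connection when one is available; the number of
            # existing connections does not change.
            return self.q.get_nowait()
        except Empty:
            if self._created_connections.current() < self.max_connections:
                conn = self.connection_class(**self.connection_kwargs).conn
                self._created_connections.incr()
                return conn
            raise

    def put(self, connection):
        # Returning a connection to the pool does not destroy it, so the
        # counter is left untouched.
        self.q.put(connection)

    def dispose(self, connection):
        # Hypothetical helper (not part of remodel): close a connection for
        # good and free its slot so a replacement may be created.
        connection.close()
        self._created_connections.decr()

With that change the counter always reflects live connections, so get() can never create more than max_connections of them.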

I guess you are right. I only had in mind the process of creating connections up to the upper limit and did not think about this aspect. You are welcome to submit a pull request for this issue. Thanks for spotting this one!

I may do that - I fixed the bug on our repo, but also added an option to provide multiple connection arguments to do a simple round-robin against all of the cluster endpoints and to add some level of failure-recovery if one of the endpoints is down. It needs test cases, and I'm not sure if you want all that functionality in there as well.
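
For reference, the multi-endpoint idea was roughly the following (a hypothetical sketch rather than the actual patch; the class name is made up and the exact exception class depends on the driver version):

    import itertools
    import rethinkdb as r

    class RoundRobinConnector(object):
        """Illustrative only, not remodel's API: rotate through several
        cluster endpoints and skip ones that refuse connections."""

        def __init__(self, endpoints, **connect_kwargs):
            # endpoints: e.g. [("db1", 28015), ("db2", 28015)]
            self.endpoints = list(endpoints)
            self.connect_kwargs = connect_kwargs
            self._cycle = itertools.cycle(self.endpoints)

        def connect(self):
            last_error = None
            # Try each endpoint at most once per call before giving up.
            for _ in range(len(self.endpoints)):
                host, port = next(self._cycle)
                try:
                    return r.connect(host=host, port=port, **self.connect_kwargs)
                except r.errors.ReqlDriverError as error:  # RqlDriverError on older drivers
                    last_error = error
            raise last_error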

I'd rather not include the multiple connection feature, as I believe remodel should be a simple library and this would deviate a bit from my initial concept. Just out of curiosity, why do you need extra connections for cluster endpoints? From my understanding, RethinkDB clusters work as expected by connecting to a single endpoint and running the queries against it.

Nonetheless, it would be great to include your fix in the next remodel release, together with a handful of tests!

My thinking was twofold:

  • Spread query load across the cluster endpoints instead of pinning everything to a single node
  • Keep the application working if the endpoint it is currently connected to goes down

That said, there are a number of alternatives:

  • Actual fail-safe load balancer
  • Fetch the addresses of the other nodes and use them as backups (sketched below)
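
For the second option, the other nodes' addresses could in principle be read from RethinkDB's system tables. A rough sketch; the field names follow the server_status documentation and may vary between server versions:

    import rethinkdb as r

    def discover_cluster_endpoints(conn):
        # Ask the cluster itself for the other nodes' driver addresses.
        # server_status lives in the special 'rethinkdb' system database.
        endpoints = []
        for status in r.db('rethinkdb').table('server_status').run(conn):
            network = status['network']
            for address in network['canonical_addresses']:
                endpoints.append((address['host'], network['reql_port']))
        return endpoints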

I think for the purposes of the library, a load balancer probably would fit the bill philosophically.

I'll go back to my patch and make a few changes/tests.

I didn't invest that much effort in the connection pool, but I see some very valid points raised by you and in the issue you've linked to. If you put together a useful and working snippet, I will be more than happy to integrate it!

Cool. I'll try to finish it up this week or early next!

I was able to do more investigation of this as well. It seems that RethinkDB supports running a local instance in proxy mode (a query router joined to the cluster), which I think is a better solution than adding round-robin logic to the connection pool. I still need to get to the patch.
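
For context, the proxy setup looks roughly like this (a sketch; the join address and database name are placeholders):

    import rethinkdb as r

    # A local proxy node is started next to the application and joined to
    # the cluster, e.g.:
    #
    #     rethinkdb proxy --join db-node-1:29015
    #
    # The application then only ever connects to localhost; the proxy takes
    # care of routing queries and of cluster nodes coming and going.
    conn = r.connect(host='localhost', port=28015, db='myapp')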

Closing this one for now. Feel free to reopen the issue if you managed to find time for this!