wulczer/txpostgres

Add some kind of self-healing to the pool class

Closed this issue · 3 comments

Currently, when a postgres server is restarted, any application using txpostgres's ConnectionPool with it either has to detect this (and replace the connections on its own) or has to be restarted manually.

I believe handling such errors should be the (transparent) task of the pool. I thought about it while poking around in the source code and came up with three options, sorted by decreasing complexity (and convenience):

Write a Pool that re-connects and re-queries if a connection got lost

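A lot of refactoring would be necessary for the pool to have all the information it needs to retry (whether the call was runOperation or runQuery, and the query itself). Auto-retry is also kind of dangerous when I think about it, since a statement that already took effect before the connection dropped could end up being executed twice.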

Replace all connections in the pool once a connection error occurs

A bit tricky to achieve, as connections that aren't in the pool when the first error is triggered might cause reconnections all over again.

A solution might be tagging the connections with a "generation tag" (a simple increasing int would be enough) that's also saved as the current tag for the pool. Once a connection is lost, connections with older tags would be reconnected, tagged with the new tag and a specific "ConnectionLost" exception would be raised. If the tag of the lost connection shows that the connection belongs to an older generation, it would just refresh itself – leaving the pool alone.

The application would still have to handle disconnect errors, but only once, and it wouldn't have to reconnect on its own. I think this would be my favorite solution.
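A minimal sketch of the generation-tag idea, in plain Python with no Twisted or txpostgres involved. `FakeConnection`, `GenerationPool` and `handle_lost` are hypothetical names made up for illustration, not txpostgres API, and raising the "ConnectionLost" error to the caller is left out to keep the bookkeeping visible:

```python
class FakeConnection:
    """Hypothetical stand-in for a database connection."""

    def __init__(self, generation):
        self.generation = generation
        self.reconnects = 0

    def reconnect(self, generation):
        # A real connection would close and reopen its socket here.
        self.reconnects += 1
        self.generation = generation


class GenerationPool:
    """Hypothetical pool that tags connections with a generation number."""

    def __init__(self, size):
        self.generation = 0
        self.idle = [FakeConnection(self.generation) for _ in range(size)]

    def get(self):
        return self.idle.pop()

    def put(self, conn):
        self.idle.append(conn)

    def handle_lost(self, conn):
        """Return True if this loss renewed the pool, False if only conn."""
        if conn.generation < self.generation:
            # Stale tag: the pool was already renewed for this outage,
            # so the connection just refreshes itself.
            conn.reconnect(self.generation)
            return False
        # First report of the outage: bump the tag and reconnect every
        # idle connection plus the one that reported the loss.
        self.generation += 1
        for other in self.idle + [conn]:
            other.reconnect(self.generation)
        return True


pool = GenerationPool(3)
a, b = pool.get(), pool.get()   # two connections checked out, one idle
first = pool.handle_lost(a)     # renews the pool
second = pool.handle_lost(b)    # b was checked out earlier, so it is stale
```

Here `b` was outside the pool when the first error arrived, so its later failure only refreshes `b` itself and leaves the already renewed idle connection alone.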

Don't re-add broken connections

And finally, the simplest and most robust solution (which in fact could be added independently of the above): if _putBackAndPassthrough() encounters a connection error, it doesn't add the connection back into the pool but creates a new one instead.

The drawback is that it would take min errors (i.e. the number of connections in the pool) until the pool is healthy again. I'd add this as a simple boolean kwarg to the ConnectionPool.
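To illustrate the drawback, here is a rough simulation in plain Python. It does not touch txpostgres at all: `FakeConnection`, `ReplacingPool`, the `replace_broken` flag and `errors_until_healthy` are hypothetical names, and a "query" is reduced to checking whether the connection it ran on was broken:

```python
class FakeConnection:
    """Hypothetical stand-in for a connection that may be dead."""

    def __init__(self, broken=False):
        self.broken = broken


class ReplacingPool:
    """Hypothetical pool that swaps out a connection after it fails."""

    def __init__(self, connections, replace_broken=True):
        self.idle = list(connections)
        self.replace_broken = replace_broken

    def run(self):
        """Simulate one query; return True on success, False on error."""
        conn = self.idle.pop(0)
        failed = conn.broken
        if failed and self.replace_broken:
            # Do not put the dead connection back; create a fresh one.
            conn = FakeConnection()
        self.idle.append(conn)
        return not failed


def errors_until_healthy(pool, limit=100):
    """Count failed queries until no broken connection is left."""
    errors = 0
    for _ in range(limit):
        if not any(conn.broken for conn in pool.idle):
            return errors
        if not pool.run():
            errors += 1
    return None  # the pool never recovered within the limit


def all_broken(n):
    return [FakeConnection(broken=True) for _ in range(n)]


with_flag = errors_until_healthy(ReplacingPool(all_broken(3)))
without_flag = errors_until_healthy(
    ReplacingPool(all_broken(3), replace_broken=False))
```

With three dead connections, the pool with the flag enabled sees three failed queries before it is healthy, while the pool without it keeps handing out dead connections.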

I hope it's halfway understandable what I mean. :) Let me know what you think!

I'm also experiencing issues with this. It would be nice if ConnectionPool did the reconnecting automatically and transparently. However, what is the simplest workaround at the moment? Catch and detect the DatabaseError that signifies a lost connection and attempt to re-create the ConnectionPool?

Please see this comment about the automatic reconnection code I just pushed: #23 (comment)

Cleaning up older issues, I'm going to close this one, since reconnection is now fairly easy to implement in user code.