Why connection.close() can raise an exception?
dolfinus opened this issue · 13 comments
I'm using AIOConnectionPool and try to check its behavior if LDAP server become unavailable.
Initial setup - I've started LDAP server on localhost, and try to search for a specific user:
Run local LDAP server
Using https://github.com/docker-ThoTeam/slapd-server-mock:
docker run --rm -v ./data.ldif.TEMPLATE:/bootstrap/data.ldif.TEMPLATE -p 389:389 thoteam/slapd-server-mock
>>> import logging
>>> from bonsai import LDAPClient, LDAPError, LDAPSearchScope
>>> from bonsai.asyncio import AIOConnectionPool
>>> log = logging.getLogger()
>>> client = LDAPClient("ldap://localhost")
>>> pool = AIOConnectionPool(client)
>>> async with pool.spawn() as connection:
>>> results = await connection.search(base="ou=people,dc=ldapmock,dc=local", filter_exp="uid=developer1", scope=LDAPSearchScope.SUBTREE)
>>> results
[{'dn': <LDAPDN uid=developer1,ou=people,dc=ldapmock,dc=local>, 'objectClass': ['inetOrgPerson', 'organizationalPerson', 'person', 'top'], 'cn': ['developer One'], 'sn': ['One'], 'givenName': ['developer'], 'mail': ['developer.one@ldapmock.local'], 'uid': ['developer1']}]
>>> connection
<bonsai.asyncio.aioconnection.AIOLDAPConnection object at 0x7f922fbd3770>
>>> connection.closed
False
Pool now has an opened connection. Stop LDAP server, and try to search for user again
try:
async with pool.spawn() as connection:
try:
results = await connection.search(base="ou=people,dc=ldapmock,dc=local", filter_exp="uid=developer1", scope=LDAPSearchScope.SUBTREE)
except LDAPError:
log.exception("Exception during search")
connection.close()
except LDAPError:
log.error("Exception during spawn")
raise
Calling connection.close()
allows to skip returning connection back to pool, so next access to the same pool will initialize a new connection: https://github.com/noirello/bonsai/blob/74ce8f8548a04ab763a21ffaf4bef3ef529ced18/docs/advanced.rst#connection-timeouts
But connection.close()
raises an exception, so connection is not being closed:
Exception during search
Traceback (most recent call last):
File "<console>", line 4, in <module>
File "/home/maxim/Repo/horizon/.venv/lib/python3.11/site-packages/bonsai/asyncio/aioconnection.py", line 59, in _poll
raise exc
File "/home/maxim/Repo/horizon/.venv/lib/python3.11/site-packages/bonsai/asyncio/aioconnection.py", line 54, in _poll
return await asyncio.wait_for(fut, timeout)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.11/asyncio/tasks.py", line 452, in wait_for
return await fut
^^^^^^^^^
File "/home/maxim/Repo/horizon/.venv/lib/python3.11/site-packages/bonsai/asyncio/aioconnection.py", line 40, in _ready
res = super().get_result(msg_id)
^^^^^^^^^^^^^^^^^^^^^^^^^^
bonsai.errors.ConnectionError: Can't contact LDAP server. (0xFFFF [-1])
Exception during spawn
Traceback (most recent call last):
File "<console>", line 4, in <module>
File "/home/maxim/Repo/horizon/.venv/lib/python3.11/site-packages/bonsai/asyncio/aioconnection.py", line 59, in _poll
raise exc
File "/home/maxim/Repo/horizon/.venv/lib/python3.11/site-packages/bonsai/asyncio/aioconnection.py", line 54, in _poll
return await asyncio.wait_for(fut, timeout)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.11/asyncio/tasks.py", line 452, in wait_for
return await fut
^^^^^^^^^
File "/home/maxim/Repo/horizon/.venv/lib/python3.11/site-packages/bonsai/asyncio/aioconnection.py", line 40, in _ready
res = super().get_result(msg_id)
^^^^^^^^^^^^^^^^^^^^^^^^^^
bonsai.errors.ConnectionError: Can't contact LDAP server. (0xFFFF [-1])
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/lib/python3.11/concurrent/futures/_base.py", line 456, in result
return self.__get_result()
^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.11/concurrent/futures/_base.py", line 401, in __get_result
raise self._exception
File "<console>", line 7, in <module>
bonsai.errors.ConnectionError: Can't contact LDAP server. (0xFFFF [-1])
>>> connection
<bonsai.asyncio.aioconnection.AIOLDAPConnection object at 0x7f922fbd3770>
>>> connection.closed
False
Because connection.close()
raised an exception, it looks like it haven't changed connection.closed
flag to True
. This connection is broken now, it cannot be used for making any requests.
For example, start LDAP server again and use the same pool. Because connection.closed
is not True
, it is still in the pool and being reused:
try:
async with pool.spawn() as connection:
try:
results = await connection.search(base="ou=people,dc=ldapmock,dc=local", filter_exp="uid=developer1", scope=LDAPSearchScope.SUBTREE)
except LDAPError:
log.exception("Exception during search")
connection.close()
except LDAPError:
log.error("Exception during spawn")
raise
Exception during search
Traceback (most recent call last):
File "<console>", line 4, in <module>
File "/home/maxim/Repo/horizon/.venv/lib/python3.11/site-packages/bonsai/asyncio/aioconnection.py", line 59, in _poll
raise exc
File "/home/maxim/Repo/horizon/.venv/lib/python3.11/site-packages/bonsai/asyncio/aioconnection.py", line 54, in _poll
return await asyncio.wait_for(fut, timeout)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.11/asyncio/tasks.py", line 452, in wait_for
return await fut
^^^^^^^^^
File "/home/maxim/Repo/horizon/.venv/lib/python3.11/site-packages/bonsai/asyncio/aioconnection.py", line 40, in _ready
res = super().get_result(msg_id)
^^^^^^^^^^^^^^^^^^^^^^^^^^
bonsai.errors.LDAPError: Success. (0x0000 [0])
Well, something is definitely went wrong.
Fortunately, connection now become closed, so next requests made using the same pool working as expected:
>>> connection
<bonsai.asyncio.aioconnection.AIOLDAPConnection object at 0x7f922fbd3770>
>>> connection.closed
True
>>> try:
... async with pool.spawn() as connection:
... try:
... results = await connection.search(base="ou=people,dc=ldapmock,dc=local", filter_exp="uid=developer1", scope=LDAPSearchScope.SUBTREE)
... except LDAPError:
.. log.exception("Exception during search")
... connection.close()
... except LDAPError:
... log.error("Exception during spawn")
... raise
>>> results
[{'dn': <LDAPDN uid=developer1,ou=people,dc=ldapmock,dc=local>, 'objectClass': ['inetOrgPerson', 'organizationalPerson', 'person', 'top'], 'cn': ['developer One'], 'sn': ['One'], 'givenName': ['developer'], 'mail': ['developer.one@ldapmock.local'], 'uid': ['developer1']}]
>>> connection
<bonsai.asyncio.aioconnection.AIOLDAPConnection object at 0x7f922effc670>
>>> connection.closed
False
Few more things bother me:
-
connection.close()
is synchronous in both LDAPConnection and AIOLDAPConnection, but it looks like it try to send network request to LDAP server for properly unbind/disconnect instead of just dropping connection. Current implementation ofAIOConnectionPool.close()
is asynchronous, and it calls this synchronous method, so it can block event loop:
bonsai/src/bonsai/asyncio/aiopool.py
Line 82 in 74ce8f8
Line 115 in 74ce8f8
-
Current
Pool.close()
implementation callsconn.close()
for all connections using a cycle. So if some connection raises an exception during close, other connections in the pool may left unclosed. The only workaround I could find is:
for _ in range(pool.shared_connections):
with contextlib.suppress(LDAPError):
pool.close()
Which does not look right.
Proposed solution:
- Change
connection.close()
implementation to never raise any exceptions, if that even possible. - If
AIOLDAPConnection.close()
requires some request to be made to LDAP server (unbind, explicit connection close), make itasync
, and useawait
inAIOConnectionPool.close()
- If establishing the connection is failed with any
LDAPError
(not justConnectionError
, but alsoAuthenticationError
,InvalidDN
,TimeoutError
and so on), immediately mark connection as closed without requiring to callconnection.close()
explicitly. This connection cannot be used anymore. - Change
ConnectionPool.close()
implementation to pop connection from set, e.g.:
def close(self) -> None:
while self._idles:
with suppress(Exception):
conn = self._idles.pop()
conn.close()
while self._used:
with suppress(Exception):
conn = self._used.pop()
conn.close()
self._closed = True
To avoid partial cleanup of the pool if some connection cannot be closed.
Yes, the close method does more, than it probably should. Before the unbind, it sends out abandon requests for the unfinished operations. These are kind of a fire-and-forget requests, that's why the close method isn't awaitable. There aren't any message ids that can be polled.
One idea is to add an optional parameter to the close method that'd skip sending abandon requests.
If fire-and-forget means that event loop is not blocked, it's fine to leave this method synchronous.
But it should properly close connection in any case, without leaving it half-closed if such request caused an exception. Adding optional parameter is just sweeping the problem under the rug.
So?
This is the idea, that I was talking about: dev...optional-abandon
This way the close method wouldn't do extra abandon operations by default, and it has fewer chances to raise error.
Why not just remove this line?
bonsai/src/_bonsai/ldapconnection.c
Line 240 in 74ce8f8
And replace it with error/debug message
It seems a too drastic measure for me at that level, that could easily lead silent fails (simple messages can be overlooked).
I've changed the ConnectionPool.close()
almost as you suggested in your proposed solutions.
It seems a too drastic measure for me at that level, that could easily lead silent fails
Em, connection is getting closed, it can't be used anymore in any case.
I've checked that you can call close on a disconnected/stopped LDAP server without errors.
Also looked through the OpenLDAP source code, and it seems to me that ldap_unbind_ext
always returns LDAP_SUCCESS
. (There's some client control freeing part that I didn't check, but doesn't matter here, there's no control set).
But I still think it's good to keep the exception setting in place. The unbind implementation could change anytime, and the function signature suggests that it could return with an error.
IMHO the problem here in this case can be boiled down to two things:
- The
LDAPConnection.close
method originally did more, than it was supposed to do by default: calling abandon on on-going requests. This is what's likely caused the raisedLDAPError
exception, and this is addressed on the branch. - There's no proper way currently to identify broken connection in the pool. Calling close on connection is a suboptimal solution that's come up here, when this issue was first raised.
OK, I've tried to reproduce the error on this branch, using issue description, and it looks like connection is closed properly:
Open connection pool, then stop LDAP server and call:
>>> try:
... async with pool.spawn() as connection:
... try:
... results = await connection.search(base="ou=people,dc=ldapmock,dc=local", filter_exp="uid=developer1", scope=LDAPSearchScope.SUBTREE)
... except LDAPError:
... log.exception("Exception during search")
... connection.close()
... except LDAPError:
... log.error("Exception during spawn")
... raise
...
Exception during search
Traceback (most recent call last):
File "<console>", line 4, in <module>
File "/home/maxim/Repo/bonsai/venv/lib/python3.12/site-packages/bonsai/asyncio/aioconnection.py", line 59, in _poll
raise exc
File "/home/maxim/Repo/bonsai/venv/lib/python3.12/site-packages/bonsai/asyncio/aioconnection.py", line 54, in _poll
return await asyncio.wait_for(fut, timeout)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/maxim/.pyenv/versions/3.12.0/lib/python3.12/asyncio/tasks.py", line 510, in wait_for
return await fut
^^^^^^^^^
File "/home/maxim/Repo/bonsai/venv/lib/python3.12/site-packages/bonsai/asyncio/aioconnection.py", line 40, in _ready
res = super().get_result(msg_id)
^^^^^^^^^^^^^^^^^^^^^^^^^^
bonsai.errors.ConnectionError: Can't contact LDAP server. (0xFFFF [-1])
>>> connection.closed
True
Thank you!
The LDAPConnection.close method originally did more, than it was supposed to do by default: calling abandon on on-going requests. This is what's likely caused the raised LDAPError exception, and this is addressed on the branch.
One more thing here - according to https://linux.die.net/man/3/ldap_unbind_ext (Unbinding
section):
The ldap_unbind_s() call is just another name for ldap_unbind(); both of these calls are synchronous in nature.
Is it syncronous
in terms of asyncio as well? I mean does this line block the event loop, and prevents other coroutines to be started while calling await connection.close()
?
bonsai/src/bonsai/asyncio/aiopool.py
Lines 83 to 84 in 74ce8f8
There's no proper way currently to identify broken connection in the pool.
If some ldap-related function call returned LDAP_SERVER_DOWN
because LDAP server is not available, and then LDAP server is ready again, is it possible to use the same opened connection to the same LDAP server? Does returning LDAP_SERVER_DOWN
leads to self->socketpair
or ssock
in state which cannot be used anymore?
One more thing here - according to https://linux.die.net/man/3/ldap_unbind_ext (Unbinding section):
The ldap_unbind_s() call is just another name for ldap_unbind(); both of these calls are synchronous in nature.
The ldap_unbind_ext
does call a write system command before closing the socket. So yes, if there's not enough buffer in the kernel space that method could block. Setting bonsai.set_connect_async(True)
might help, but that could lead other problems when TLS is used. The async part of the libldap
library's interface works well with operations (add, delete, search etc.), but when it comes to connection handling, it isn't exactly the best.
If some ldap-related function call returned LDAP_SERVER_DOWN because LDAP server is not available, and then LDAP server is ready again, is it possible to use the same opened connection to the same LDAP server
It seems it doesn't possible to reuse the connection.
Ok, what's next? What do you think about merging optional-abandon
branch to dev
/master
?
I merged the branch to dev
.
Thanks!