(aioredis v2) "Bad file descriptor" / "Event loop is closed" when shutting down event loop
egnor opened this issue · 13 comments
I have a simple app which:
- launches a main with `asyncio.run(main())`
- in that main, creates a client (using SSL) with `redis = aioredis.from_url(...)`
- does some things with the client
- shuts down the client with `await redis.close()`
- returns from the main, at which point `asyncio.run` stops the event loop
This causes the following exception to be logged as WARNING (not thrown!) on exit:
⚠️ asyncio: Fatal error on SSL transport
protocol: <asyncio.sslproto.SSLProtocol object at 0x7fa348eb8790>
transport: <_SelectorSocketTransport closing fd=6>
Traceback (most recent call last):
File "/usr/lib/python3.9/asyncio/selector_events.py", line 918, in write
n = self._sock.send(data)
OSError: [Errno 9] Bad file descriptor
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/lib/python3.9/asyncio/sslproto.py", line 684, in _process_write_backlog
self._transport.write(chunk)
File "/usr/lib/python3.9/asyncio/selector_events.py", line 924, in write
self._fatal_error(exc, 'Fatal write error on socket transport')
File "/usr/lib/python3.9/asyncio/selector_events.py", line 719, in _fatal_error
self._force_close(exc)
File "/usr/lib/python3.9/asyncio/selector_events.py", line 731, in _force_close
self._loop.call_soon(self._call_connection_lost, exc)
File "/usr/lib/python3.9/asyncio/base_events.py", line 746, in call_soon
self._check_closed()
File "/usr/lib/python3.9/asyncio/base_events.py", line 510, in _check_closed
raise RuntimeError('Event loop is closed')
RuntimeError: Event loop is closed
Note that while you may consider the extra log a non-issue, "Bad file descriptor" errors are always a bit scary to me: there's always the possibility that the fd could have been reused elsewhere in the process, in which case random data would be written to an unrelated fd, causing arbitrary corruption and bad data.
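For reference, the "Event loop is closed" half of the traceback is easy to reproduce with plain asyncio and no Redis at all (a minimal sketch; the callback and loop here are stand-ins, not aioredis internals): scheduling anything on a closed loop raises exactly this `RuntimeError`, which is what the SSL transport's deferred `call_soon` hits.

```python
import asyncio

# Scheduling any callback on a closed loop raises the same
# RuntimeError seen at the bottom of the traceback above.
loop = asyncio.new_event_loop()
loop.close()

error = ""
try:
    loop.call_soon(lambda: None)
except RuntimeError as exc:
    error = str(exc)

print(error)  # "Event loop is closed"
```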
As for the root cause, there is some complexity here, and you will find a number of threads online about SSL connection shutdown and event loop shutdown. The consensus seems to be that, at present at least, all connections need to be fully closed (with `wait_closed()` awaited) before the event loop is shut down. This is arguably a bug depending on who you ask, but it is the current state of asyncio in any case.
In aioredis, `Connection.disconnect` does in fact do that. However, `Redis.close` simply returns its active connection to its connection pool and does not disconnect it; the connection pool itself is left alone. This makes sense because the connection pool may be shared with other `Redis` instances and other users. However, in this case the connection pool was created in the `Redis` constructor, so there's no opportunity to disconnect it. If I add a direct call like this:
await redis.close() # return connection to pool
await redis.connection_pool.disconnect() # shut down all connections in pool
Then, the error doesn't happen. But it feels like the `connection_pool` is an internal detail that `Redis` should be managing, if it was created internally and not passed in?
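The "fully close before the loop stops" pattern the workaround relies on can be shown with plain asyncio streams (a minimal self-contained sketch using a throwaway local server, no Redis involved): `close()` only begins teardown, and `await wait_closed()` is what guarantees the transport is gone before `asyncio.run` destroys the loop.

```python
import asyncio

async def main() -> bool:
    # Throwaway local server so the sketch is self-contained.
    server = await asyncio.start_server(
        lambda reader, writer: None, "127.0.0.1", 0
    )
    host, port = server.sockets[0].getsockname()[:2]

    reader, writer = await asyncio.open_connection(host, port)
    writer.write(b"ping")
    await writer.drain()

    # close() only *starts* teardown; wait_closed() guarantees the
    # transport is fully shut down before the loop itself stops.
    writer.close()
    await writer.wait_closed()

    server.close()
    await server.wait_closed()
    return writer.is_closing()

fully_closed = asyncio.run(main())
print(fully_closed)
```

Skipping the two `wait_closed()` awaits here is the streams analogue of returning a live connection to the pool and letting the loop die under it.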
I haven't taken much time to look around our source code for validation, and it's late, so I'll let others give input. For now, my response: in Django, there is an explicit call that closes cache connections for the developer. Whatever service you're using may not do what Django does, so maybe you do need to call the disconnect explicitly yourself, but I'm not entirely sure: https://github.com/django/django/blob/f331eba6d576752dd79c4b37c41d981daa537fe6/django/core/cache/__init__.py#L59-L66
Hope that somewhat answers it unless I'm totally brain dead at the moment.
@Andrew-Chen-Wang I'm not using a "service" at all, this is a simple command line tool.
@abrookins I'm working on making a standalone example now!
Here's an example that demos the issue. You'll need to set `$REDIS_URL` to a database of some kind; this will scan and dump all the keys, so probably don't use a big database, or modify the code to add a match expression, or something. Uncomment the line after the bug reference to show my workaround.
The redis instance I'm using is quite some distance away (on Heroku), it's possible the higher network latency is part of triggering this issue. If you can't repro it with your own test databases, I can give one of you credentials to my database instance (it's not all that sensitive, just a hobby project).
[note: edited to remove some value-printing code that would fail on an empty database]
#!/usr/bin/env python3
import asyncio
import os
import sys

import aioredis


async def main() -> None:
    redis_url = os.environ.get("REDIS_TLS_URL") or os.environ["REDIS_URL"]
    redis = aioredis.from_url(redis_url, ssl_cert_reqs=None)
    cursor = None
    while cursor != 0:
        cursor, keys = await redis.scan(cursor=cursor or 0)
        for key in keys:
            print(f'key: {key.decode()}')
    await redis.close()
    # https://github.com/aio-libs/aioredis-py/issues/1078
    # await redis.connection_pool.disconnect()


sys.exit(asyncio.run(main()))
@egnor I tried running your example against a local Redis (fyi, I had to remove `ssl_cert_reqs=None` because I wasn't using SSL, unlike your Heroku instance). The first thing that happened was that the scan returned an empty `keys` list, which resulted in `redis.mget` raising a TypeError saying it required at least one positional argument (since `keys` was empty, there were no values to pass). I did check that `cursor` was an int `0`.
Maybe wrap your `zip` for loop with a try/except TypeError.
However your original comment:
SSL connection shutdown and event loop shutdown. The consensus seems to be that, at present at least, all connections need to be fully closed (with wait_closed() awaited for)
is something that may need looking into here.
Ah, empty database I guess? Anyway I removed the value-fetching part entirely (editing the post above), that's not important.
Also, I suspect this requires an SSL instance to repro, since it's the SSL stack in particular that writes data as part of a closing handshake.
Interesting issue. I'm not able to reproduce with an SSL connection to a not-super-close Redis instance, myself, running your example code on Python 3.9.5. What version of Python do you have?
If you want to send me credentials, you can email me at andrew.brookins@redislabs.com, or you can share them with me via 1Password or really any way you choose!
@egnor Also, what version of Python are you using?
% python --version
Python 3.9.5
I just verified it still repro's. Will send credentials. Thanks for looking into it!
loop = asyncio.get_event_loop()
loop.run_until_complete(main())
This can be used in place of `asyncio.run(main())` to fix "Event loop is closed".
@egnor please let me know if this is still happening on the master branch. If so, do you mind telling me the exact number of seconds this error takes to occur, from the start of `main` to the end?
Sorry for going AWOL! This project ended and I haven't worked with the codebase in a while. So we can just close this unless it reappears somewhere, or someone else hits the same problem?
@LWS-CSDN I am using Google Translate to understand you, but I think you're suggesting to replace asyncio.run() with loop.run_until_complete() to avoid this issue. I think that's mostly brushing it under the rug by avoiding destruction of the event loop...? But does that mean you encountered this issue?
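The difference between the two approaches is observable directly (a minimal sketch; `grab_loop` is a hypothetical helper, and `asyncio.new_event_loop()` stands in for the suggested `get_event_loop()`, which is deprecated for this use on newer Pythons): `asyncio.run()` closes its loop on exit, so any late teardown callbacks hit a closed loop, while `run_until_complete()` leaves the loop open.

```python
import asyncio

async def grab_loop():
    # Hand back the loop we're running on so we can inspect it later.
    return asyncio.get_running_loop()

# asyncio.run() creates a fresh loop and closes it when the coroutine
# returns, so anything scheduled afterwards hits a closed loop...
loop_a = asyncio.run(grab_loop())
a_closed = loop_a.is_closed()  # True

# ...while run_until_complete() leaves the loop open afterwards, so
# late call_soon() callbacks (like SSL teardown) can still run.
loop_b = asyncio.new_event_loop()
loop_b.run_until_complete(grab_loop())
b_closed = loop_b.is_closed()  # False
loop_b.close()

print(a_closed, b_closed)
```

Which is exactly why the swap hides the warning rather than closing the connections: the loop simply never gets destroyed.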
@egnor thanks for the speedy reply! I'm tempted to close this, but redis-py is introducing further SSL tests, so I'll check to see if this issue could be potentially resolved via their testing and any introduction of new code.
Works great in my case! Thank YOU! @LWS-CSDN
Fix log
Exception
[2022-07-07 11:11:11,111: ERROR/ForkPoolWorker-2] Fatal error on SSL transport
protocol: <asyncio.sslproto.SSLProtocol object at 0x123>
transport: <_SelectorSocketTransport closing fd=135>
Traceback (most recent call last):
File "/usr/lib/python3.9/asyncio/selector_events.py", line 918, in write
n = self._sock.send(data)
File "/home/user/project/venv/lib/python3.9/site-packages/gevent/_socketcommon.py", line 722, in send
return self._sock.send(data, flags)
OSError: [Errno 9] Bad file descriptor
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/lib/python3.9/asyncio/sslproto.py", line 684, in _process_write_backlog
self._transport.write(chunk)
File "/usr/lib/python3.9/asyncio/selector_events.py", line 924, in write
self._fatal_error(exc, 'Fatal write error on socket transport')
File "/usr/lib/python3.9/asyncio/selector_events.py", line 719, in _fatal_error
self._force_close(exc)
File "/usr/lib/python3.9/asyncio/selector_events.py", line 731, in _force_close
self._loop.call_soon(self._call_connection_lost, exc)
File "/usr/lib/python3.9/asyncio/base_events.py", line 746, in call_soon
self._check_closed()
File "/usr/lib/python3.9/asyncio/base_events.py", line 510, in _check_closed
raise RuntimeError('Event loop is closed')
RuntimeError: Event loop is closed
Fix
Use
loop = asyncio.get_event_loop()
loop.run_until_complete(main())
instead of
asyncio.run(main())
some tags for search engines:
aiogram, asyncio, celery, django