An equivalent of __context__, but for exceptions that got pre-empted by another exception unwinding the stack
As @Nikratio points out in python-trio/pytest-trio#30, Trio's way of propagating exceptions can lead to obscure results when you have multiple communicating tasks where a crash in one of them triggers a crash in another. Depending on timing and the presence of checkpoints in the first task's cleanup clauses, you can end up in a situation where the second exception propagates up to a common nursery, triggers a cancellation of the original task, and this cancellation wipes out the original exception, leaving you scratching your head over the root cause.
Here's a simplified version of the original example:
```python
import trio, trio.testing

async def echo_server(server_stream):
    try:
        async with server_stream:
            data = await server_stream.receive_some(10)
            await server_stream.send_lal(data)  # <--- notice the typo
    finally:
        # Pretend we had some other cleanup to do
        await trio.hazmat.checkpoint()
        await trio.hazmat.checkpoint()

async def echo_client(client_stream):
    await client_stream.send_all(b"x")
    assert await client_stream.receive_some(1) == b"x"

async def main():
    client_stream, server_stream = trio.testing.lockstep_stream_pair()
    async with trio.open_nursery() as nursery:
        nursery.start_soon(echo_server, server_stream)
        nursery.start_soon(echo_client, client_stream)

trio.run(main)
```
The exact details might change depending on trio version. For me right now on 0.3.0, putting one checkpoint in the `finally` block gives me a `MultiError([AssertionError, AttributeError])`, and putting two checkpoints gives me just a plain `AssertionError` – the `AttributeError` has disappeared.
There doesn't seem to be any way to actually preserve the `AttributeError` here – the `echo_server` task caught it, and then got cancelled while handling it. In this case it would eventually have propagated out, but in general there's no way to know that. Maybe it was caught for real. In @Nikratio's example, it wasn't going to propagate further, but was going to get logged.
However, taking a page from Python 3's implicit exception chaining, we can at least preserve the information that the `AssertionError` preempted the `AttributeError`, so the information is available later when trying to figure out wtf happened. At least in principle.
One possible approach:
- Implement #285, so that nurseries can peek at the `Cancelled` exceptions that were used to unwind other branches of the stack
- if any of these `Cancelled` exceptions have `__context__` values, gather those up
- attach them to the exception that the nursery re-raises, in a new `__preempted__` attribute or similar (it's tempting to wedge this into `__context__` instead of making something new, but I don't think we can really do that meaningfully)
- update our traceback printing code to check for `__preempted__`, and say something about it
One trick is how to record `__preempted__`, given that we can have complicated situations like: the same exception passing upwards through multiple nurseries, and preempting some exceptions at each one. Or, a `MultiError` that preempts some other exceptions, but then part of the `MultiError` gets caught and it converts back into a regular single exception.
Idea: make `__preempted__` a dict mapping frames to sets of preempted exceptions – with the idea that the frame records where during the unwinding the preemption took place. When we filter a `MultiError`, preserve and combine the `__preempted__` from `MultiError` objects that get collapsed. When printing, make a note at the point in the stack where the preemption happened. Maybe the default is that we print a little note like "(at this point, preempted: RuntimeError, ValueError)" and give an envvar that can be set to get full details?
Regarding #285, it might make sense to apply this logic to `TooSlowError`s too... maybe that'd just be clutter though, dunno.