Problem with very short timeouts, including zero
Opened this issue · 0 comments
I have a tool for which it makes sense to allow a timeout to be specified on the command line. It also makes sense for a user to try a timeout of zero to see what the timeout failure looks like. But this doesn't work: despite not setting swallow_exc
to False, the TimeoutException
gets raised. Digging into threading.Timer
a bit, this appears to be the result of a 'feature', in that a timeout of 0 is treated specially, causing the stop
function to be called from inside the call to Timer.start
.
This in turn appears to trigger a race condition wrt the ___exit__
clause on BaseTimeout
, based on evidence from the following experiment.
The attached code is a trivial use of stopit, with timeout supplied from the command line.
TL;DR
Illustrates the problem, reproduced on various systems. Also discovers a related problem on Windows 10, for very small values of the timeout
How to reproduce
Here are four example input/output pairs, two good to start, then two that illustrate the two problems:
Runs w/o timing out
>: while :; do python3 test_stopit.py 1; done
done 0
done 0
done 0
done 0
done 0
[repeats]
Times out
>: while :; do python3 test_stopit.py 0.01; done
done 2
done 2
done 2
done 2
done 2
[repeats]
Timeout of 0: always raises TimeoutException
>: while :; do python3 test_stopit.py 0; done
Traceback (most recent call last):
File "test_stopit.py", line 3, in <module>
with stopit.ThreadingTimeout(float(sys.argv[1])) as to_ctx_mgr:
File "/usr/local/lib/python3.8/site-packages/stopit/utils.py", line 73, in __enter__
self.setup_interrupt()
File "/usr/local/lib/python3.8/site-packages/stopit/threadstop.py", line 60, in setup_interrupt
self.timer.start()
File "/usr/lib/python3.8/threading.py", line 857, in start
self._started.wait()
File "/usr/lib/python3.8/threading.py", line 558, in wait
signaled = self._cond.wait(timeout)
File "/usr/lib/python3.8/threading.py", line 302, in wait
waiter.acquire()
stopit.utils.TimeoutException
[repeats]
Very small timeout, two different results
>: while :; do python3 test_stopit.py 0.0000000001; done
done 2
done 2
[repeats 5 times]
done 2
Traceback (most recent call last):
File "test_stopit.py", line 3, in <module>
with stopit.ThreadingTimeout(float(sys.argv[1])) as to_ctx_mgr:
File "/usr/local/lib/python3.8/site-packages/stopit/utils.py", line 73, in __enter__
self.setup_interrupt()
File "/usr/local/lib/python3.8/site-packages/stopit/threadstop.py", line 60, in setup_interrupt
self.timer.start()
File "/usr/lib/python3.8/threading.py", line 857, in start
self._started.wait()
File "/usr/lib/python3.8/threading.py", line 558, in wait
signaled = self._cond.wait(timeout)
File "/usr/lib/python3.8/threading.py", line 302, in wait
waiter.acquire()
stopit.utils.TimeoutException
done 2
done 2
Traceback (most recent call last):
[as above]
stopit.utils.TimeoutException
Traceback (most recent call last):
[as above]
stopit.utils.TimeoutException
done 2
[etc., _ad nauseum]
The above were run under Cygwin, Python 3.8.10.
Under Ubuntu focal, Python 3.8.10, the zero case fails as above, but the very small timeout always successfully gives done 2
.
Likewise with Debian buster, Python 3.7.3 and several other Linux distros.
Under Windows 10, Python 3.8.10 however, the zero case sometimes fails as above (TimeoutException
under waitout.acquire
), and sometimes trips the assertion error!
AssertionError: state is 2, not 1
For the very small timeout value, we get some successes, i.e. 'done 2', and also some failures, but the failure surfaces at a different point:
>python test_stopit.py 0.0000000001
done 0
Traceback (most recent call last):
File "D:\work\jar\doi\pdf1\test_stopit.py", line 9, in <module>
print('done',to_ctx_mgr.state)
stopit.utils.TimeoutException
Conclusions
It would be nice to
- Handle a timeout of zero in
stopit
itself, or at the very least include a warning in the documentation; - Debug the race condition on Windows/Cygwin (I'm assuming the underlying problem is the same), but this probably needs to involve someone familiar with the internals of the threading module...
Test code
import stopit, sys
with stopit.ThreadingTimeout(float(sys.argv[1])) as to_ctx_mgr:
assert to_ctx_mgr.state == to_ctx_mgr.EXECUTING, \
"state is %s, not %s"%(to_ctx_mgr.state, to_ctx_mgr.EXECUTING)
for i in range(1000000):
if i>0:
pass
print('done',to_ctx_mgr.state)