Kattis/problemtools

TLE submissions don't time out when running verifyproblem


I noticed that verifyproblem now lets a TLE submission run a single test case to completion, even well past the time limit plus safety margin, instead of the previous behaviour, where it would stop at or soon after the safety limit. Once it finds a single test case that times out, it doesn't go on to the other test cases, but it seems to insist on completely finishing that test case rather than cutting off the CPU time after a while.

This sounds like correct behavior to me. The time limit (and safety margin) is per test case, not summed over all test cases. If you still think there is an error, please post a minimal example.

I noticed this on a problem for an upcoming contest, so I shouldn't post identifying details here, but I'll post the essentials for now. The verification log from your recent run of the problem included the following two lines:

Slowest AC runtime: 0.058, setting timelim to 1 secs, safety margin to 2 secs
TLE submission [redacted].py (Python 3 (w/pypy)) OK: TLE [test case: test case secret/26, CPU: 3.00s @ test case secret/26]

When I run verifyproblem on my end, those two lines read:

Slowest AC runtime: 0.078, setting timelim to 1 secs, safety margin to 2 secs
TLE submission [redacted].py (Python 3 w/Pypy) OK: TLE [test case: test case secret/26, CPU: 5.69s @ test case secret/26]

In particular, I have the same time limit and safety margin as you, but on your end the CPU time stops at 3.00s (which I think is the expected behaviour, since 1s time limit + 2s safety margin = 3s), while on my end it keeps going for 5.69s on the same test case. For other problems with much longer TLE submissions (e.g. exponential running time), verifyproblem just gets stuck on "Running" until I forcibly quit it.

How are you running verifyproblem? Inside what environment(s)?

I'm running it from the command line in Ubuntu 20.04.3 using WSL2 on my Windows 10 laptop.

I don't have a Windows box to test with. But I wonder if it's related to the fact that setrlimit does not appear to work well on WSL: microsoft/WSL#4509

verifyproblem uses setrlimit to put limits on child process running time.

You might try hacking the source of verifyproblem (see https://github.com/Kattis/problemtools/blob/master/problemtools/run/limit.py) to print out which limit it attempts to set and what getrlimit actually reports afterwards.
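The same comparison can be made outside problemtools. The following is only a minimal sketch under stated assumptions: it does not reproduce what limit.py does, the helper name probe_cpu_limit is hypothetical, and it relies solely on Python's standard resource and os modules. It sets RLIMIT_CPU in a forked child (so the calling process keeps its own limits) and returns what getrlimit reports there.

```python
import os
import resource


def probe_cpu_limit(seconds):
    """Set RLIMIT_CPU in a forked child and return the (soft, hard) pair
    that getrlimit reports afterwards, or None if setting the limit failed.

    Hypothetical diagnostic helper; not part of problemtools.
    """
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: apply the limit, read it back, report through the pipe.
        os.close(read_fd)
        try:
            _, hard = resource.getrlimit(resource.RLIMIT_CPU)
            # The soft limit may not exceed the existing hard limit.
            if hard != resource.RLIM_INFINITY:
                seconds = min(seconds, hard)
            resource.setrlimit(resource.RLIMIT_CPU, (seconds, hard))
            soft, hard = resource.getrlimit(resource.RLIMIT_CPU)
            os.write(write_fd, f"{soft} {hard}".encode())
        finally:
            os._exit(0)
    # Parent: collect whatever the child reported.
    os.close(write_fd)
    with os.fdopen(read_fd) as pipe:
        data = pipe.read()
    os.waitpid(pid, 0)
    if not data:
        return None
    soft, hard = (int(field) for field in data.split())
    return soft, hard


if __name__ == "__main__":
    requested = 3
    print("requested soft limit:", requested)
    print("getrlimit reports (soft, hard):", probe_cpu_limit(requested))
```

If the reported soft limit differs from the requested one, the limit is not being applied in that environment. If the two match, the limit was recorded, but that alone does not show that the kernel enforces it.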

Interestingly, if I set a hard limit of 1 second (or even 0 seconds) for CPU time, I do get the warning from verifyproblem that "runs involving higher CPU limits than this may behave incorrectly", but I get the same behaviour with ~6s runtimes using the same example as above. I'll play around with this a bit more to see if I can figure out the cause. I'll note though that before I updated problemtools I wasn't seeing this issue in WSL2.
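One way to separate the environment from problemtools is to test whether a CPU limit is enforced at all. This is a rough sketch, not what verifyproblem does internally: the function name run_busy_loop and the 10 second wall-clock fallback are assumptions made for illustration. It starts an endless loop in a child process under a 1 second soft RLIMIT_CPU and reports how the child ended.

```python
import resource
import signal
import subprocess
import sys


def run_busy_loop(cpu_seconds=1, wall_timeout=10):
    """Run an endless loop under a soft RLIMIT_CPU and describe the outcome.

    Hypothetical diagnostic helper; not part of problemtools.
    """

    def apply_limit():
        # Runs in the child between fork and exec. Only the soft limit
        # is lowered; the existing hard limit is left unchanged.
        _, hard = resource.getrlimit(resource.RLIMIT_CPU)
        resource.setrlimit(resource.RLIMIT_CPU, (cpu_seconds, hard))

    try:
        proc = subprocess.run(
            [sys.executable, "-c", "while True: pass"],
            preexec_fn=apply_limit,
            timeout=wall_timeout,  # wall-clock fallback if the limit is ignored
        )
    except subprocess.TimeoutExpired:
        return "limit not enforced: child still running at wall-clock timeout"
    if proc.returncode == -signal.SIGXCPU:
        return "killed by SIGXCPU: soft limit enforced"
    if proc.returncode == -signal.SIGKILL:
        return "killed by SIGKILL: hard limit enforced"
    return f"child exited with return code {proc.returncode}"


if __name__ == "__main__":
    print(run_busy_loop())
```

On a system where the limit is enforced, the child should be killed after roughly one second of CPU time. If the wall-clock fallback triggers instead, the limit is being ignored by the environment, which would match the long runtimes described above.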

Sorry, it turns out I was in fact using WSL1 and not WSL2... I got a new laptop and assumed that a new laptop would have WSL2 as the default.