xd009642/tarpaulin

Spurious failure "ESRCH: No such process"

Closed this issue · 7 comments

Rarely, it happens that when running tarpaulin on a test suite, a test fails with the following error:

[ERROR tarpaulin] Failed to get test coverage! Error: Failed to run tests!
Error: Error occurred when handling ptrace event: ESRCH: No such process

I didn't really find a way to consistently reproduce it, but it happens from time to time.

Interesting... I'm going to add a new --debug flag to tarpaulin to print out a lot more information to aid in debugging tracing based issues. I'll let you know when it's ready

Here is a debug output I managed to reproduce, if that helps.

The test code is as follows, with a dependency on futures-executor = "0.2.1"

#[test]
pub fn a() {
    futures_executor::ThreadPool::new();
}

#[test]
pub fn b() {
    futures_executor::ThreadPool::new();
}

and the debug log is:

[TRACE tarpaulin] Debug mode activated
[INFO tarpaulin] Running Tarpaulin
[INFO tarpaulin] Building project
[DEBUG tarpaulin] Processing tarpotest
[INFO tarpaulin] Launching test
[INFO tarpaulin] running /home/levans/test/tarpotest/target/debug/deps/tarpotest-a3b55d59815e497a
[TRACE tarpaulin] Adding trace at address 0x403650 in src/lib.rs:2
[TRACE tarpaulin] Adding trace at address 0x403654 in src/lib.rs:3
[TRACE tarpaulin] Adding trace at address 0x403674 in src/lib.rs:8
[TRACE tarpaulin] Adding trace at address 0x403670 in src/lib.rs:7
[TRACE tarpaulin] Adding trace at address 0x403b10 in src/lib.rs:2
[TRACE tarpaulin] Adding trace at address 0x403b30 in src/lib.rs:7
[TRACE tarpaulin] Test PID is 13759
[TRACE tarpaulin] Caught inferior transitioning to Initialise state
[TRACE tarpaulin] Initialised inferior, transitioning to wait state

running 2 tests
[TRACE tarpaulin] Caught signal Stopped(Pid(13759), SIGTRAP)
[TRACE tarpaulin] Hit address 0x403b10
[TRACE tarpaulin] Incrementing hit count for trace
[TRACE tarpaulin] Caught signal Stopped(Pid(13759), SIGTRAP)
[TRACE tarpaulin] Hit address 0x403b10
[TRACE tarpaulin] Caught signal Stopped(Pid(13759), SIGTRAP)
[TRACE tarpaulin] Hit address 0x403650
[TRACE tarpaulin] Incrementing hit count for trace
[TRACE tarpaulin] Caught signal Stopped(Pid(13759), SIGTRAP)
[TRACE tarpaulin] Hit address 0x403653
[TRACE tarpaulin] Caught signal Stopped(Pid(13759), SIGTRAP)
[TRACE tarpaulin] Hit address 0x403654
[TRACE tarpaulin] Incrementing hit count for trace
[TRACE tarpaulin] Caught signal Stopped(Pid(13759), SIGTRAP)
[TRACE tarpaulin] Hit address 0x403656
[TRACE tarpaulin] Caught signal PtraceEvent(Pid(13759), SIGTRAP, 3)
[TRACE tarpaulin] New thread spawned 13760
[TRACE tarpaulin] Caught signal Stopped(Pid(13760), SIGSTOP)
[TRACE tarpaulin] Caught signal PtraceEvent(Pid(13759), SIGTRAP, 3)
[TRACE tarpaulin] New thread spawned 13761
[TRACE tarpaulin] Caught signal PtraceEvent(Pid(13759), SIGTRAP, 3)
[TRACE tarpaulin] New thread spawned 13762
[TRACE tarpaulin] Caught signal Stopped(Pid(13762), SIGSTOP)
[TRACE tarpaulin] Caught signal PtraceEvent(Pid(13759), SIGTRAP, 3)
[TRACE tarpaulin] New thread spawned 13763
.[TRACE tarpaulin] Caught signal Stopped(Pid(13759), SIGTRAP)
[TRACE tarpaulin] Hit address 0x403b30
[TRACE tarpaulin] Incrementing hit count for trace
[TRACE tarpaulin] Caught signal Stopped(Pid(13759), SIGTRAP)
[TRACE tarpaulin] Hit address 0x403b30
[TRACE tarpaulin] Caught signal Stopped(Pid(13763), SIGSTOP)
[TRACE tarpaulin] Caught signal Stopped(Pid(13759), SIGTRAP)
[TRACE tarpaulin] Hit address 0x403670
[TRACE tarpaulin] Incrementing hit count for trace
[TRACE tarpaulin] Caught signal Stopped(Pid(13759), SIGTRAP)
[TRACE tarpaulin] Hit address 0x403673
[TRACE tarpaulin] Caught signal PtraceEvent(Pid(13763), SIGTRAP, 6)
[TRACE tarpaulin] Child exiting
[TRACE tarpaulin] Caught signal Stopped(Pid(13759), SIGTRAP)
[TRACE tarpaulin] Hit address 0x403674
[TRACE tarpaulin] Incrementing hit count for trace
[TRACE tarpaulin] Caught signal Stopped(Pid(13759), SIGTRAP)
[TRACE tarpaulin] Hit address 0x403676
[TRACE tarpaulin] Caught signal Exited(Pid(13763), 0)
[TRACE tarpaulin] Caught signal PtraceEvent(Pid(13762), SIGTRAP, 6)
[TRACE tarpaulin] Child exiting
[TRACE tarpaulin] Caught signal Stopped(Pid(13761), SIGSTOP)
[TRACE tarpaulin] Caught signal PtraceEvent(Pid(13760), SIGTRAP, 6)
[TRACE tarpaulin] Child exiting
[TRACE tarpaulin] Caught signal Exited(Pid(13762), 0)
[TRACE tarpaulin] Caught signal PtraceEvent(Pid(13761), SIGTRAP, 6)
[TRACE tarpaulin] Child exiting
[TRACE tarpaulin] Caught signal Exited(Pid(13761), 0)
[TRACE tarpaulin] Caught signal PtraceEvent(Pid(13759), SIGTRAP, 3)
[TRACE tarpaulin] New thread spawned 13764
[TRACE tarpaulin] Caught signal Stopped(Pid(13764), SIGSTOP)
[TRACE tarpaulin] Caught signal Exited(Pid(13760), 0)
[TRACE tarpaulin] Caught signal PtraceEvent(Pid(13759), SIGTRAP, 3)
[TRACE tarpaulin] New thread spawned 13765
[TRACE tarpaulin] Caught signal Stopped(Pid(13765), SIGSTOP)
[TRACE tarpaulin] Caught signal PtraceEvent(Pid(13759), SIGTRAP, 3)
[TRACE tarpaulin] New thread spawned 13766
[TRACE tarpaulin] Caught signal Stopped(Pid(13766), SIGSTOP)
[TRACE tarpaulin] Caught signal PtraceEvent(Pid(13759), SIGTRAP, 3)
[TRACE tarpaulin] New thread spawned 13767
[TRACE tarpaulin] Caught signal Stopped(Pid(13767), SIGSTOP)
.
[TRACE tarpaulin] Caught signal PtraceEvent(Pid(13764), SIGTRAP, 6)
test result: [TRACE tarpaulin] Child exiting
ok. 2 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out

[TRACE tarpaulin] Caught signal PtraceEvent(Pid(13765), SIGTRAP, 6)
[TRACE tarpaulin] Child exiting
[TRACE tarpaulin] Caught signal Exited(Pid(13764), 0)
[TRACE tarpaulin] Caught signal Exited(Pid(13765), 0)
[TRACE tarpaulin] Caught signal PtraceEvent(Pid(13766), SIGTRAP, 6)
[TRACE tarpaulin] Child exiting
[ERROR tarpaulin] Failed to get test coverage! Error: Failed to run tests! Error: Error occurred when handling ptrace event: ESRCH: No such process

That potentially helps... For some reason 13767 is being killed off without me catching it.

If I had to hazard a guess - ptrace is poorly designed in some areas and one of them is that PTRACE_CONTINUE sends a continue command to every thread in the process. Given the threadpool seems to spawn 3 threads that immediately exit this might cause a race condition in some instances where one PTRACE_CONTINUE continues the exit on 2+ threads before I handle the signal for the second.

That sounds somewhat feasible to me, I'll add some more debug printouts through the coming days and try some things out. Hopefully, this will lead to a solution

@vberger could you see if this issue still persists with the latest version on the develop branch?

I can still reproduce it, though it seems to occur more rarely (occurred once in around 30 runs).

@vberger I'm pretty sure this is solved by the fix in #190, I'm currently releasing 0.9.0 if you could try to recreate and let me know if I can close this. Otherwise I'll close it in a few days and reopen it if there are any updates

I cannot seem to reproduce it any more. 🎉

I'll reopen if I ever see it again.