Timeout doesn't work with pathological child process--possible deadlock?
acruise opened this issue · 3 comments
Hi there, I'm using os-lib_2.13:0.3.0 in an automated student assignment evaluation system... We need to compile their C code and run the binaries to check the output, and sometimes they're... bad.
I have a particular student program that hangs early in its execution. I can fix the program, but that's not the point: students' programs will be doing all kinds of damage, and I need os.proc to kill them if they hang. :)
The last two lines of the strace of this student program are:
fstat(1, {st_dev=makedev(0, 0x18), st_ino=3, st_mode=S_IFCHR|0600, st_nlink=1, st_uid=1000, st_gid=5, st_blksize=1024, st_blocks=0, st_rdev=makedev(0x88, 0), st_atime=1571542425 /* 2019-10-19T20:33:45.542742043-0700 */, st_atime_nsec=542742043, st_mtime=1571542425 /* 2019-10-19T20:33:45.542742043-0700 */, st_mtime_nsec=542742043, st_ctime=1571530834 /* 2019-10-19T17:20:34.582741621-0700 */, st_ctime_nsec=582741621}) = 0
futex(0xa01490, FUTEX_WAIT_PRIVATE, 2, NULL
When I run this program as a subprocess in os.proc, three threads stay blocked forever. Here's the top of their stack traces:
main thread:
"main" #1 prio=5 os_prio=0 tid=0x00007ff7dc04e800 nid=0x678c in Object.wait() [0x00007ff7e2876000]
  java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    - waiting on <0x000000071f35bc08> (a java.lang.UNIXProcess)
    at java.lang.Object.wait(Object.java:502)
    at java.lang.UNIXProcess.waitFor(UNIXProcess.java:395)
    - locked <0x000000071f35bc08> (a java.lang.UNIXProcess)
    at os.SubProcess.waitFor(SubProcess.scala:56)
    at os.proc.run$1(ProcessOps.scala:182)
    at os.proc.stream(ProcessOps.scala:186)
    at os.proc.call(ProcessOps.scala:70)
one of the I/O threads:
"Thread-12" #24 prio=5 os_prio=0 tid=0x00007ff7dd417000 nid=0x67c8 runnable [0x00007ff7bdffb000]
  java.lang.Thread.State: RUNNABLE
    at java.io.FileInputStream.readBytes(Native Method)
    at java.io.FileInputStream.read(FileInputStream.java:255)
    at java.io.BufferedInputStream.read1(BufferedInputStream.java:284)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
    - locked <0x000000071f35df28> (a java.lang.UNIXProcess$ProcessPipeInputStream)
    at os.SubProcess$OutputStream.read(SubProcess.scala:177)
    - locked <0x000000071f362260> (a os.SubProcess$OutputStream)
    at os.SubProcess$OutputStream.read(SubProcess.scala:165)
    - locked <0x000000071f362260> (a os.SubProcess$OutputStream)
    at os.Internals$.transfer0(Internals.scala:17)
    at os.proc$$anon$2.run(ProcessOps.scala:132)
    at java.lang.Thread.run(Thread.java:748)
the other I/O thread:
"Thread-11" #23 prio=5 os_prio=0 tid=0x00007ff7dd415000 nid=0x67c7 runnable [0x00007ff7bdefa000]
  java.lang.Thread.State: RUNNABLE
    at java.io.FileInputStream.readBytes(Native Method)
    at java.io.FileInputStream.read(FileInputStream.java:255)
    at java.io.BufferedInputStream.read1(BufferedInputStream.java:284)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
    - locked <0x000000071f35be68> (a java.lang.UNIXProcess$ProcessPipeInputStream)
    at os.SubProcess$OutputStream.read(SubProcess.scala:177)
    - locked <0x000000071f3600e0> (a os.SubProcess$OutputStream)
    at os.SubProcess$OutputStream.read(SubProcess.scala:165)
    - locked <0x000000071f3600e0> (a os.SubProcess$OutputStream)
    at os.Internals$.transfer0(Internals.scala:17)
    at os.proc$$anon$1.run(ProcessOps.scala:122)
    at java.lang.Thread.run(Thread.java:748)
this is the bit of my code that's running the subprocess:
private def run_!(command: List[String], cwd: Option[File]): (Int, ByteString, ByteString) = {
  val proc = os.proc(command)
  val result = proc.call(
    cwd = cwd.map(os.Path(_)).orNull,
    timeout = 5000L,
    check = false
  )
  (result.exitCode, ByteString(result.out.bytes), ByteString(result.err.bytes))
}
Much as it pains me to have to say this: commons-exec works fine here; it kills even the most stubborn subprocesses after the timeout elapses.
private def run_!(command: List[String], cwd: Option[File]): (Int, ByteString, ByteString) = {
  val out = new ByteArrayOutputStream()
  val err = new ByteArrayOutputStream()
  val exec = new DefaultExecutor()
  cwd.foreach(exec.setWorkingDirectory)
  exec.setWatchdog(new ExecuteWatchdog(5000L))
  exec.setStreamHandler(new PumpStreamHandler(out, err))
  exec.setExitValues(null)
  val cmd = new CommandLine(command.head)
  cmd.addArguments(command.tail.toArray)
  val exit = exec.execute(cmd)
  val stdout = out.toByteArray
  val stderr = err.toByteArray
  (exit, ByteString(stdout), ByteString(stderr))
}
Wouldn't be surprised if there was some misbehavior in there. As a workaround, you could probably use os.proc.spawn, Thread.sleep, and .destroy.
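For anyone hitting the same thing, here is a rough sketch of what that spawn / Thread.sleep / .destroy workaround might look like. It is not the library's own fix: `runWithTimeout` is a hypothetical helper name, the 50 ms poll interval and 200 ms grace period are arbitrary, and it assumes os-lib is on the classpath and that inheriting stdout/stderr (rather than capturing them) is acceptable.

```scala
// Hypothetical helper, not part of os-lib: runs a command and kills it after a timeout.
// Returns Some(exitCode) if the child finished in time, None if it had to be killed.
def runWithTimeout(command: Seq[String], timeoutMillis: Long): Option[Int] = {
  // Assumption: output capture is skipped here. Inheriting stdout/stderr means
  // a chatty child cannot block on a full pipe that nobody is draining.
  val sub = os.proc(command).spawn(stdout = os.Inherit, stderr = os.Inherit)

  // Poll with Thread.sleep until the child exits or the deadline passes.
  val deadline = System.currentTimeMillis() + timeoutMillis
  while (sub.isAlive() && System.currentTimeMillis() < deadline) Thread.sleep(50)

  if (sub.isAlive()) {
    sub.destroy()                               // polite termination request first
    Thread.sleep(200)                           // short grace period (arbitrary)
    if (sub.isAlive()) sub.destroyForcibly()    // then a forced kill
    None
  } else Some(sub.exitCode())
}
```

Something like `runWithTimeout(List("./student_binary"), 5000L)` (the binary name is made up) would then come back with `None` for a hung program instead of blocking the caller. Capturing stdout and stderr would need extra reader threads or file redirection, which this sketch leaves out.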
Should be fixed in master, I think.