with child processes, fix toggle_blocking(x, 1) on stdio
jo-he opened this issue · 7 comments
The problem here is that the file status flags belong to the open file description behind a file descriptor, and that description is shared between a parent and a child process after fork(). So removing O_NONBLOCK in the child (as the VM currently does on stdin/stdout/stderr before exec() and exit()) propagates back to the parent, and I/O there no longer works as expected.
The toggling at exit() time is probably not required, as it should not propagate back to the parent across exec() boundaries (i.e. to the process that started Owl – but I need to verify this on different operating systems). But for the exec() case I currently see no easy (portable) solution.
Update: File status flags do propagate back to the parent process even across exec() boundaries. I wrote the following program to verify this:
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/wait.h>

/* Runs in the child: clear O_NONBLOCK on stdin and confirm it is gone. */
int do_child (void)
{
    int fl0, fl1;

    fl0 = fcntl(STDIN_FILENO, F_GETFL);
    if (fl0 == -1)
        return 1;
    fl1 = fl0 & ~O_NONBLOCK;
    if (fl0 == fl1) {
        printf("%s\n", "-> O_NONBLOCK is not set in child process");
        return 2;
    }
    if (fcntl(STDIN_FILENO, F_SETFL, fl1) == -1)
        return 3;
    fl0 = fcntl(STDIN_FILENO, F_GETFL);
    if (fl0 == -1)
        return 4;
    if (fl0 & O_NONBLOCK)
        return 5;
    return 0;
}

int main (int argc, char *argv[])
{
    unsigned int op;

    /* self-executed? */
    if (argc == 2)
        return do_child();

    /* op == 0: plain fork(); op == 1: fork() followed by exec() */
    for (op = 0; op != 2; ++op) {
        pid_t pid;
        int fl0, fl1, ret;

        /* initialize: make sure O_NONBLOCK is set on stdin */
        fl0 = fcntl(STDIN_FILENO, F_GETFL);
        if (fl0 == -1)
            return 6;
        fl1 = fl0 | O_NONBLOCK;
        if (fl0 != fl1) {
            if (fcntl(STDIN_FILENO, F_SETFL, fl1) == -1)
                return 7;
            fl0 = fcntl(STDIN_FILENO, F_GETFL);
            if (fl0 == -1)
                return 8;
        }
        if (!(fl0 & O_NONBLOCK))
            return 9;

        printf("test file status flag propagation across %s()\n", op ? "fork() + exec" : "a simple fork");
        /* flush before fork() so the child does not re-emit buffered output */
        fflush(stdout);

        pid = fork();
        if (pid == 0) {
            if (op) {
                char *args[] = { argv[0], "1", NULL };
                execv(argv[0], args);
                return 10; /* only reached if execv() failed */
            }
            return do_child();
        }
        if (pid == -1)
            return 11;
        if (waitpid(pid, &ret, 0) != pid)
            return 12;
        if (!WIFEXITED(ret))
            return 13;
        ret = WEXITSTATUS(ret);
        if (ret != 0)
            return ret;

        /* check whether the change made by the child is visible here */
        fl0 = fcntl(STDIN_FILENO, F_GETFL);
        if (fl0 == -1)
            return 14;
        printf("-> O_NONBLOCK has%s been reset by child\n", fl0 & O_NONBLOCK ? " not" : "");
    }
    return 0;
}
Currently, the easiest way to solve this is probably to try re-opening stdio via /dev/std{in,out,err}, even if this is not necessarily portable. I'll investigate and write some code that does this as smartly as possible.
Good catch. I didn't expect the flags could be inherited across process boundaries. Kind of makes sense, though, as file descriptors are more of a kernel resource than a process one.
Would remapping unique stdio file descriptors be a portable solution for this? Then one could make sure the same fds are not used by the child and the parent.
Related to this - there should be a (toggle-blocking fd bool), so that this kind of stuff could be done from lisp side.
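For illustration, the C side of such a primitive could look roughly like the sketch below. The name set_blocking and its return convention are hypothetical and not taken from the Owl sources; it only wraps the usual fcntl() F_GETFL/F_SETFL pair. Note that it still changes the flag for every process sharing the same open file, so it does not solve the problem discussed above by itself.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper (not Owl's actual primitive):
 * make fd blocking (blocking != 0) or non-blocking (blocking == 0).
 * Returns 0 on success, -1 on error (errno is set by fcntl). */
int set_blocking(int fd, int blocking)
{
    int flags, wanted;

    flags = fcntl(fd, F_GETFL);
    if (flags == -1)
        return -1;
    wanted = blocking ? (flags & ~O_NONBLOCK) : (flags | O_NONBLOCK);
    if (wanted == flags)
        return 0; /* already in the requested mode */
    return fcntl(fd, F_SETFL, wanted) == -1 ? -1 : 0;
}
```

A caller would use set_blocking(fd, 1) to get the effect of toggle_blocking(x, 1) from the issue title.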
Unfortunately, re-opening a file descriptor via /dev/std{in,out,err} or /dev/fd/[0-2] seems to make a difference on Linux only. On most other Unices, opening via these special files has the same effect as dup():
https://www.freebsd.org/cgi/man.cgi?query=fd&sektion=4&manpath=FreeBSD+12-current
http://netbsd.gw.com/cgi-bin/man-cgi?fd+4+NetBSD-7.1
https://man.openbsd.org/fd.4
https://developer.apple.com/legacy/library/documentation/Darwin/Reference/ManPages/man4/fd.4.html
https://www.unix.com/man-page/opensolaris/4/fd/
I actually verified this on OpenBSD and AIX. And therefore I see no portable way to create a new set of stdio file descriptors that has its flags independent from the parent process.
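To make the dup() behaviour concrete, here is a small sketch that checks whether a duplicated descriptor shares its status flags with the original. The function name dup_shares_flags is made up for this example. It flips O_NONBLOCK on the copy, looks at the original, and restores the previous state afterwards.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical probe: returns 1 if toggling O_NONBLOCK on a dup() of fd
 * is visible on fd itself, 0 if it is not, and -1 on error.
 * The original flag state is restored before returning. */
int dup_shares_flags(int fd)
{
    int before, after, copy, shared = -1;

    before = fcntl(fd, F_GETFL);
    if (before == -1)
        return -1;
    copy = dup(fd);
    if (copy == -1)
        return -1;
    /* flip O_NONBLOCK on the copy only */
    if (fcntl(copy, F_SETFL, before ^ O_NONBLOCK) != -1) {
        after = fcntl(fd, F_GETFL);
        if (after != -1)
            shared = ((after ^ before) & O_NONBLOCK) != 0;
        fcntl(copy, F_SETFL, before); /* put the old flags back */
    }
    close(copy);
    return shared;
}
```

On a POSIX system this should return 1 for any valid descriptor, because dup() yields a second descriptor for the same open file description. The same probe could in principle be adapted to a descriptor obtained by opening /dev/fd/N, which is how the difference between Linux and the other systems would show up.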
Maybe the most portable solution would be to avoid the use of O_NONBLOCK (at least on stdio), and always call select() or poll() before doing I/O. But then, large writes could still block on sockets or pipes…
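A minimal sketch of that idea for the read side, assuming a descriptor that stays in blocking mode. The helper name read_when_ready and the -2 timeout code are invented for this example:

```c
#include <poll.h>
#include <unistd.h>

/* Hypothetical helper: wait up to timeout_ms milliseconds for fd to become
 * readable, then read at most len bytes from it.
 * Returns the number of bytes read (0 at end of file), -1 on error,
 * or -2 if nothing became readable within the timeout. */
ssize_t read_when_ready(int fd, void *buf, size_t len, int timeout_ms)
{
    struct pollfd pfd;
    int n;

    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;

    n = poll(&pfd, 1, timeout_ms);
    if (n == -1)
        return -1;
    if (n == 0)
        return -2; /* timed out, the descriptor was left untouched */
    /* readable, at end of file, or in an error state: read() reports which */
    return read(fd, buf, len);
}
```

With a timeout of 0 this only peeks at readiness, which is roughly what a non-blocking read would have provided. The write side is harder, as noted above: POLLOUT only says that some data can be written, so a large write() on a blocking pipe or socket can still block.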
Thanks, you've made a good analysis of the problem. It is somewhat amusing that reading and writing data portably is still causing issues in 2018.
The semantics of I/O are intended to be those of message passing between threads, which is asynchronous message passing. Initially the file descriptors had actual threads around them, and system calls were performed only from the responsible thread. The problem was that, since each thread was only aware of its own file descriptor, one could not sleep properly when polling for input from two or more file descriptors.
In the current iteration, the thread controller collects information about file descriptors that would block in the next operation, and wakes the threads that wanted to do blocking I/O on them when something relevant happens.
Looks like this will again require some attention. I guess the first step is to figure out what primitives are available portably.
During research I also stumbled across this:
https://jdebp.eu/FGA/dont-set-shared-file-descriptors-to-non-blocking-mode.html
And the most portable way really is to use poll() or select().
Stdio should now remain blocking and other file descriptors are handled as usual. Does this seem to solve the issue adequately?
Yes, I think we can close this for now. The only ugly thing that remains is that dup() to stdio usually leaves a file descriptor >2 in blocking mode behind. In most common cases, that one is not used and is closed right afterwards, so from my current view it is quite unlikely that anyone will hit this issue.