guillermocalvo/exceptions4c

Signal handling not compatible with exceptions

markmaker opened this issue · 0 comments

Hi @guillermocalvo,

Thank you for publishing exceptions4c; it is a very nice project. I have some fundamental concerns, though, so I hope you will take the time to look at the following with an open mind. It is meant with the best of intentions.

Background

I developed something very similar to exceptions4c 30 years ago, under Win16/Win32s, using macros. Later that was replaced by SEH under Win32, just by redefining the macros. SEH apparently also provides safe capturing of some hardware exceptions, like dereferencing null pointers. Now I'm porting to Linux, and I've seen that your nice exceptions4c includes signals, which would again allow me to capture null pointer dereferences, and perhaps new UNIX-specific conditions.

Problem: Signals are asynchronous

But now I'm struggling to understand how exceptions4c could ever work correctly with signals. After encountering some problems, I read about signals in The Linux Programming Interface (which is brilliant), and unless I'm missing something fundamental, I don't think this can ever work correctly.

Most signals are asynchronous, i.e. the thread can be interrupted at any point, very unlike explicitly throwing exceptions synchronously. Normally, a signal handler (callback) will either abort the process or do minimal work, like setting a flag, and then return, whereupon the thread resumes from wherever it was interrupted. The resumed thread can then check the flag periodically and act synchronously when needed.
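
To illustrate the conventional pattern, here is a minimal sketch in plain POSIX C. The names got_sigusr1, on_sigusr1, install_flag_handler and poll_signal_flag are made up for this example and are not part of exceptions4c; SIGUSR1 is chosen arbitrarily.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

/* Hypothetical flag. volatile sig_atomic_t is the object type that a
   handler can portably write to. */
static volatile sig_atomic_t got_sigusr1 = 0;

/* The handler does the bare minimum: it records that the signal arrived. */
static void on_sigusr1(int signo)
{
    (void)signo;
    got_sigusr1 = 1;
}

/* Installs the handler for SIGUSR1. Returns 0 on success, -1 on error. */
static int install_flag_handler(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigusr1;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;   /* restart interrupted system calls */
    return sigaction(SIGUSR1, &sa, NULL);
}

/* Called periodically from normal (synchronous) code.
   Returns 1 if a signal arrived since the last call, else 0. */
static int poll_signal_flag(void)
{
    if (got_sigusr1) {
        got_sigusr1 = 0;
        return 1;
    }
    return 0;
}
```

The main loop would call poll_signal_flag() at points where it is safe to react, and could throw a regular, synchronous exception from there.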

Signal handlers are only allowed to call a very limited set of so-called "async-signal-safe functions". The use of most library functions, and even some system calls, is forbidden, because the interrupted code may itself have been in the middle of one of those functions, leaving its global or heap data in an inconsistent transient state.
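
As a small sketch of what that restriction means in practice: a handler that wants to report something cannot use printf(), but write() is on the async-signal-safe list. The names report_fd, safe_report_handler and install_report_handler are invented for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical destination for the report; standard error by default. */
static int report_fd = STDERR_FILENO;

/* Uses only write(), which is async-signal-safe. printf(), malloc() and
   friends are not, because the signal may have interrupted them. */
static void safe_report_handler(int signo)
{
    static const char msg[] = "signal caught\n";
    int saved_errno = errno;            /* write() may clobber errno */
    ssize_t written = write(report_fd, msg, sizeof msg - 1);
    (void)written;                      /* nothing safe to do on failure */
    (void)signo;
    errno = saved_errno;
}

/* Installs the reporting handler for the given signal.
   Returns 0 on success, -1 on error. */
static int install_report_handler(int signo)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = safe_report_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    return sigaction(signo, &sa, NULL);
}
```

Saving and restoring errno matters because the interrupted code may be about to inspect it.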

https://man7.org/linux/man-pages/man7/signal-safety.7.html

By using a non-local goto, a.k.a. longjmp(), from the signal handler, exceptions4c is effectively extending the signal handler indefinitely, i.e. the remainder of the program is subject to the "async-signal-safe functions" restriction. That is of course not practical, except maybe for exception handlers that merely report a failure and then abort().

Even if this were not an issue, I believe you would also have to protect the signal handler from being interrupted itself by further signals, i.e. you should use sigaction() with signal masking instead of just signal(), and make sigsetjmp() and siglongjmp() mandatory, instead of alternatively allowing plain setjmp() and longjmp().
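
Roughly what I have in mind, as a sketch only (it does not make jumping out of an asynchronous handler safe, for the reasons given above). All names here (recovery_point, recovery_armed, caught_signal, jump_out_handler, install_masked_handler, run_guarded) are hypothetical and not taken from exceptions4c.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <string.h>

static sigjmp_buf recovery_point;                  /* hypothetical jump target */
static volatile sig_atomic_t recovery_armed = 0;   /* is the target valid? */
static volatile sig_atomic_t caught_signal = 0;    /* which signal jumped */

/* Jumps back to recovery_point, but only while a guarded region is active. */
static void jump_out_handler(int signo)
{
    if (recovery_armed) {
        recovery_armed = 0;
        caught_signal = signo;
        siglongjmp(recovery_point, 1);
    }
}

/* sigaction() with a full sa_mask: no other signal can interrupt the
   handler while it runs. Returns 0 on success, -1 on error. */
static int install_masked_handler(int signo)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = jump_out_handler;
    sigfillset(&sa.sa_mask);
    sa.sa_flags = 0;
    return sigaction(signo, &sa, NULL);
}

/* Runs body(); returns 0 if it completed, or the signal number that
   interrupted it. */
static int run_guarded(void (*body)(void))
{
    caught_signal = 0;
    /* Second argument 1: save the signal mask, so that siglongjmp()
       restores it and undoes the blocking done for the handler. */
    if (sigsetjmp(recovery_point, 1) == 0) {
        recovery_armed = 1;
        body();
        recovery_armed = 0;
    }
    return (int)caught_signal;
}

/* Two sample bodies for trying it out. */
static void raises_sigusr1(void) { raise(SIGUSR1); }
static void does_nothing(void) { }
```

With plain setjmp()/longjmp() the mask set up for the handler is not guaranteed to be restored after the jump, which is why I think the sig* variants would have to be mandatory.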

Similarly, you would have to block signals during the phase from sigsetjmp() until the new e4c_frame has been pushed (and likewise while popping it). Otherwise, a signal arriving in that window could corrupt the frame list (see also the next section for more robust frames).
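
A sketch of such a critical section, using sigprocmask(). It only covers the list update itself; the struct demo_frame and the functions push_frame_blocked and pop_frame_blocked are stand-ins I made up, not the real e4c_frame code. A multi-threaded program would use pthread_sigmask() instead.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* Hypothetical stand-in for a frame; not the real e4c_frame. */
struct demo_frame {
    struct demo_frame *previous;
};

static struct demo_frame *current_frame = NULL;

/* Links a new frame while all signals are blocked, so that no handler
   can ever observe a half-updated list. */
static void push_frame_blocked(struct demo_frame *frame)
{
    sigset_t all, saved;
    sigfillset(&all);
    sigprocmask(SIG_BLOCK, &all, &saved);      /* enter critical section */
    frame->previous = current_frame;
    current_frame = frame;
    sigprocmask(SIG_SETMASK, &saved, NULL);    /* restore previous mask */
}

/* Unlinks the top frame under the same protection. */
static void pop_frame_blocked(void)
{
    sigset_t all, saved;
    sigfillset(&all);
    sigprocmask(SIG_BLOCK, &all, &saved);
    if (current_frame != NULL) {
        current_frame = current_frame->previous;
    }
    sigprocmask(SIG_SETMASK, &saved, NULL);
}
```

Restoring the saved mask, rather than unblocking everything, keeps the functions usable from code that already runs with some signals blocked.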

Finally, I also found that the default signal handling is disruptive. Standard functions such as system(), popen() and fork() no longer work, because SIGCHLD exceptions are thrown and wait() hangs. It took me a very long time to figure this one out: the problem was hidden in a closed-source library (an ODBC driver that wanted to start the DB engine), and it simply hung.
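
For anyone hitting the same thing, one possible workaround is to restore the default SIGCHLD disposition around the affected call. This is a sketch in plain POSIX C; the wrapper name system_with_default_sigchld is made up, and since signal dispositions are process-wide, I would only trust it in a single-threaded program.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <sys/wait.h>

/* Hypothetical wrapper: runs system(command) with SIGCHLD temporarily
   reset to its default disposition, then reinstates whatever handler
   was installed before. Returns the status from system(), or -1 if the
   disposition could not be changed. */
static int system_with_default_sigchld(const char *command)
{
    struct sigaction dfl, saved;
    int status;

    memset(&dfl, 0, sizeof dfl);
    dfl.sa_handler = SIG_DFL;
    sigemptyset(&dfl.sa_mask);

    if (sigaction(SIGCHLD, &dfl, &saved) != 0) {
        return -1;
    }
    status = system(command);
    sigaction(SIGCHLD, &saved, NULL);   /* put the old handler back */
    return status;
}
```

The returned status can be inspected with WIFEXITED() and WEXITSTATUS() as usual.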

So to sum it up, I don't think exception-based signal handling can be done. Even C++ cannot do it:

https://stackoverflow.com/questions/6535258/c-exceptions-and-signal-handlers#6536536

More robust frames

As already mentioned in the context of signal-interrupted frame push/pop, it would be better to find a design that no longer relies on heap-allocated memory and the global context->current_frame, but on local variables in the loop that can always be referred to correctly (i.e. locally) before and after any long jump. The e4c_frame should be allocated on the stack and automatically be "unwound" along with the long jump. This would also automatically "self-repair" when someone mistakenly uses return or break to leave the try loop. Obviously, the finally handler (if present) would still be skipped, but having context->current_frame off by one indefinitely is surely worse.

My simple 30-year-old design worked like that, but it had its own restrictions, like allowing only one try{ } per function due to local variable name conflicts (back then using C89). I was happy with the restriction at the time, and I guess it could be overcome using extra nested blocks, more modern C, variable shadowing etc.
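
To make the idea more concrete, here is a rough, function-based sketch (not my old macro code, and not exceptions4c). The names local_frame, top_frame, try_call and throw_code are invented. The point is that each frame lives on the stack and restores the top pointer from its own local link when it finishes, instead of "popping whatever is on top".

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

/* Hypothetical stack-allocated frame. */
struct local_frame {
    jmp_buf env;
    struct local_frame *previous;   /* enclosing frame, set once */
    volatile int code;              /* set by throw_code() before jumping */
};

/* Only needed so that throw_code() can find its target. */
static struct local_frame *top_frame = NULL;

/* Jumps to the innermost active frame, carrying a non-zero code. */
static void throw_code(int code)
{
    struct local_frame *target = top_frame;
    assert(target != NULL && code != 0);
    target->code = code;
    longjmp(target->env, 1);
}

/* Runs body() inside a new frame. Returns 0 if body() completed, or the
   code passed to throw_code(). */
static int try_call(void (*body)(void))
{
    struct local_frame frame;       /* lives on this call's stack */
    frame.previous = top_frame;
    frame.code = 0;
    top_frame = &frame;

    if (setjmp(frame.env) == 0) {
        body();
    }
    /* Both the normal path and the longjmp path arrive here. Restoring
       from our own local link repairs top_frame even if an inner frame
       was left without being unlinked. */
    top_frame = frame.previous;
    return frame.code;
}

/* Sample bodies for trying it out. */
static void quiet(void) { }
static void throws_42(void) { throw_code(42); }
static void rethrows_plus_one(void) { throw_code(try_call(throws_42) + 1); }
```

Here try_call(throws_42) yields 42 and try_call(rethrows_plus_one) yields 43, with top_frame back at NULL afterwards. A macro version would need the extra nested blocks mentioned above to allow more than one try per function.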

Again, hope you take it sportingly, and in case I should have missed something, just say it out loud.

_Mark