callbacks, lua states & coroutines
Closed this issue · 5 comments
Hi Stefano,
while playing around with multithreading I discovered a problem that can occur even if "real" multithreading is not involved: this problem may affect all lua callbacks that are registered in FLTK within a lua coroutine. (one could of course say, that this is in principle not a good idea...)
Take for example the idle callback that can be registered with fl.set_idle
. The invoking lua_State
is registered as void*
with Fl::add_idle
. However if the invoking lua_State
is a coroutine, the registered state can become invalid if it is garbage collected.
Another example is the Widgetcallback
, but this is more complicated: the struct moonfltk_ud_t
contains the lua state that was valid when the widget was created.
I'm now trying to think about a solution for this problem.... I hope, this could also lead to a deeper understanding for possible multithreading problems.
What do you think?
Best regards,
Oliver
Uhm... the general solution to prevent something being garbage collected is to anchor it to the Lua registry, but I am not sure this is what we want in the coroutine case. Rather, we'd like to unregister the callback.
As far as 'real' multithreading is concerned, I'm afraid that deeper understanding is never deep enough (or, at least, you can never know if it is deep enough). One sure problem is lua_State stack smashing if the same state is used across different threads, because the execution of Lua functions is not protected and another thread may kick in in the middle of it. I had to deal with these sort of things when I implemented LuaJack, and it was a real PITA...
Yes, the problem is tricky and should not be underrated. I'm currently working on the following idea that I hope makes things as easy as possible:
FLTK is highly non multithreaded, much is handled via global variables. All callbacks must live in the same "lua main thread". One must be careful about wording here, because in lua a thread is a lua_State
that belongs to a coroutine and must be distinguished from a "system thread". There is a lua main thread, that is the lua_State that was initially created via lua_newState
. This lua main thread can be obtained from every lua corotine thread that belongs to this lua main thread via lua_rawgeti(L, LUA_REGISTRYINDEX, LUA_RIDX_MAINTHREAD);
. In pure lua there is only one lua main thread possible. This lua main thread becomes invalid if the lua interpreter closes. However by using native modules with the lua C API one could have more than one lua main thread.
In all "real" lua multi-threading libraries that don't modify the lua interpreter the system threads use different lua main threads and are therefore completely separated and use some kind of inter-thread communication. There are also libraries that allow to create new lua main threads that are living in the same system thread.
Since everything is global in FLTK it would be natural to remember just one lua main thread in a global variable and use this in all callbacks. This lua main thread is where all callbacks and critical FLTK operations are to be performed.
To deal with this I implemented a solution that allows only one lua main thread to acquire the moonfltk module (this could als be done within a coroutine, just the corresponding lua main thread is remembered). All other lua main threads are only allowed to require a module named moonfltk.background
. In this background-module there are only uncritical functions that are save to be invoked in from other lua main threads (most important one is of course fl.awake
).
This solution will therefore forbid to modify widgets and register callbacks and so on in the background threads. All this has to be done in the lua main thread that first required the moonfltk module. It is also possible to use this from other lua coroutine threads, as long as these coroutines belong to the same lua main thread (let's call it the "foreground thread"). All other lua main threads are "background threads" that have restricted moonfltk access.
This solution therefore enforces the advise from the FLTK documentation section about multithreading:
Instead, the code should be designed so that the worker threads do not update the GUI themselves and therefore never need to acquire the FLTK lock.
You can see this work in progress in my github fork. It's still work in progress, it's too early to merge this in. If you find it more convenient, I could create a merge request for this and than we look over this merge request and I develop the merge request further until it can be merged.
What do you think of this approach?
I think it's viable.
In most other MoonLibs bindings I actually store the lua_State passed to luaopen_xxx() in a global variable (which I usually name moonxxx_L) because I need it for atexit() cleanups. If your solution works, it may be reused in those other bindings too.
Feel free to create the merge request.
Great. I will continue working on the solution. So far the current work in progress works, at least the cases I tested.