machinekit/machinekit-hal

Linking and symbol visibility pollution of managed runtime namespace


cerna commented

While restructuring the project for the new CMake-based build system (#200), I discovered an interesting problem with linker namespace symbol pollution.

There is a Submakefile in the rtapi subfolder that defines the executables rtapi_app and rtapi_msgd, the shared library libhalulapi.so and the MODULE library rtapi.so. These targets share some source files and also compiler flags (for now, let's only talk about the distinction between the RTAPI and ULAPI defines). In theory, it should be possible to link both rtapi_app and rtapi_msgd against libhalulapi.so.0, as they contain the same symbols, and that way compile each shared file into only one shared library. This would also help with structuring the source tree in a more transparent and well-arranged way.

The problem is that rtapi_app hosts the Managed runtime in which the real-time capable modules run. (I use this name for lack of a better terminus technicus for the RT capable space. If you have a better idea what to call it, I am interested.) The dynamic symbols exported by these RT capable modules (which include rtapi.so and hal_lib.so) are strictly manually managed with the EXPORT_MODULE() macro. However, if one were to link rtapi_app against libhalulapi.so.0, the runtime would see symbols compiled with both the RTAPI and the ULAPI defines. In fact, this is already happening within rtapi_app because of the rtapi_support.c file, and it is the reason for the dire warning about symbol visibility mismatch.
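To illustrate what "manually managed" exports mean in general, here is a minimal sketch using the GCC/Clang visibility attribute. The macro names DEMO_EXPORT and DEMO_LOCAL and both functions are hypothetical and exist only for this illustration; this is not how EXPORT_MODULE() is implemented in Machinekit-HAL, which I have not reproduced here.

```c
/* Hypothetical export macros for illustration only (GCC/Clang specific).
   They are NOT Machinekit-HAL's EXPORT_MODULE(). */
#define DEMO_EXPORT __attribute__((visibility("default")))
#define DEMO_LOCAL  __attribute__((visibility("hidden")))

/* Hidden: usable inside the shared object, but absent from its dynamic
   symbol table, so it cannot collide with a same-named symbol elsewhere. */
DEMO_LOCAL int demo_internal_helper(int x)
{
    return x * 2;
}

/* Exported: the only symbol of this sketch that other modules could bind to. */
DEMO_EXPORT int demo_public_entry(int x)
{
    return demo_internal_helper(x) + 1;
}
```

When such a file is built into a shared object, only demo_public_entry would be expected to show up in the dynamic symbol table (for example in the output of nm -D). Linking a second library that exports everything by default into the same process is what defeats this kind of manual control.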

Fortunately, modern glibc has a way to solve this problem: linker namespaces, created with the dlmopen() call. Unfortunately, it does not really work for us. The main issue is RTLD_GLOBAL: Machinekit-HAL requires this flag so that undefined symbols in a library loaded later are automatically resolved against the symbols of libraries loaded earlier. The GNU libc developers say that this makes no sense:

/* It makes no sense to use RTLD_GLOBAL when loading a DSO into
a namespace other than the base namespace. */

For Machinekit-HAL's use case, this is clearly not true. There is even an RFC for allowing RTLD_GLOBAL with dlmopen(); however, I don't think anything will happen in the short term.
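The restriction can be checked with a small sketch like the one below. It assumes a glibc system that still contains the check quoted above, and it uses libm.so.6 purely as a convenient library that is almost always present. The helper name try_new_namespace is made up for this example.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* exposes dlmopen() and LM_ID_NEWLM in <dlfcn.h> */
#endif
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical helper: try to load `path` into a brand-new link-map
   namespace with RTLD_NOW plus `extra_flags`.
   Returns 1 if dlmopen() succeeded (the handle is closed again),
   0 if it failed; dlerror() then holds the reason. */
int try_new_namespace(const char *path, int extra_flags)
{
    void *handle = dlmopen(LM_ID_NEWLM, path, RTLD_NOW | extra_flags);
    if (handle == NULL) {
        return 0;
    }
    dlclose(handle);
    return 1;
}
```

Calling try_new_namespace("libm.so.6", 0) from a small main() should succeed, while try_new_namespace("libm.so.6", RTLD_GLOBAL) should fail on such a glibc, with dlerror() reporting an invalid mode. On older glibc versions the program has to be linked with -ldl.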

Then there is the RTLD_SHARED proposal, with which you should be able to share a symbol from the main namespace into the newly created one. This is something the Machinekit-HAL Managed runtime could also make use of.

The most interesting piece, and a possible solution, is libcapsule by Vivek Das Mohapatra of Collabora, who also proposed the aforementioned RTLD_SHARED. It is basically a library built on top of the dlmopen() call, with a few add-ons that rewrite the linker relocation table on loading, and it targets containerized runtime distribution of games.

I am not sure whether libcapsule could be used as-is, which would be best as the maintenance burden would be minimal, or whether some patching would be required. Machinekit-HAL's use case and the OpenGL one are different.

In the short term, using interim transitive INTERFACE libraries that provide the source files, to get around the visibility/scope problem, is the easiest solution for the CMake rewrite.