What are the requirements for unloading a library (`dlclose`)?

Question

VorpalBlade opened this issue a year ago · 8 comments

Side question of #525: what about dlclose?

As @RalfJung said, it is a "hornets nest". It is obviously an unsafe operation, but what are the specific safety requirements a user has to uphold to unload a library?

Off the top of my head:

- That no references to the data or code about to be unloaded still exist, including static data from the library. Lifetimes (in particular 'static) will be a lie here. Is that OK if you ensure you no longer hold any references to it?
- What about TLS variables and dlclose? This is platform/dynamic linker dependent, as I understand it?

Have I missed any concerns? I'm not very familiar with anything except Linux, so there may be platform specific concerns as well.

Answer 1 · 2024-08-13T20:26:29.000Z

> Lifetimes (in particular 'static) will be a lie here. Is that OK if you ensure you no longer hold any references to it?

Yes that's okay.

> What about TLS variables and dlclose? This is platform/dynamic linker dependent, as I understand it?

AFAIK it is an unmitigated disaster, which is why macOS just blocks unloading libraries with TLS entirely. On Linux it's not blocked but there is, to my knowledge, no sound way to unload such a library ever, so it might as well be blocked.

Answer 2 · 2024-08-13T20:57:49.000Z

Note that the problem is specifically TLS destructors, not TLS variables. Having TLS variables in a to-be-closed DSO is fine, as long as no destructors are registered for them. If it weren't for pinning, though, there could be a way to avoid the issue: mark TLS destructor registrations from a closed DSO as finalized so they don't run at thread exit.

(This might be the solution that I use in LiliumOS)
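To make the variable-versus-destructor distinction concrete from the Rust side, here is a minimal sketch. It assumes, as far as I know is the case for current `std`, that `thread_local!` only registers a thread-exit destructor for types where `std::mem::needs_drop` is true. The helper `tls_would_register_dtor` and the statics `COUNTER` and `NAME` are hypothetical names for illustration.

```rust
use std::cell::Cell;
use std::mem::needs_drop;

thread_local! {
    // `Cell<u32>` has no drop glue, so nothing has to run at thread exit.
    static COUNTER: Cell<u32> = Cell::new(0);
    // `String` has drop glue, so a destructor is expected to be registered
    // the first time this thread touches `NAME`.
    static NAME: String = String::from("worker");
}

/// Hypothetical helper: approximates whether a TLS value of type `T` would
/// need a destructor registration (an assumption about `std`, not a guarantee).
pub fn tls_would_register_dtor<T>() -> bool {
    needs_drop::<T>()
}

/// Increments the per-thread counter and returns the new value.
pub fn bump() -> u32 {
    COUNTER.with(|c| {
        c.set(c.get() + 1);
        c.get()
    })
}

/// Reads the per-thread string; the first access initializes it.
pub fn name_len() -> usize {
    NAME.with(|n| n.len())
}
```

Under that assumption, a DSO that only contains TLS like `COUNTER` would fall into the "fine" case described above, while one that has touched `NAME` on a still-running thread would not.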

Answer 3 · 2024-08-13T21:05:49.000Z

It's perhaps worth noting that macOS simply doesn't unload dylibs in a number of cases where unloading isn't supported, even though the unload request returns a success value.

TLS dtor behavior is already target dependent at program exit, so it's not particularly surprising that it'd be such for dylib unloading. (Although unmitigated UB is of course worse.)

For dlclose to be sound, I think you'd need to ensure:

- No references to data owned by the dylib exist outside the dylib.
- No function pointers to code exported by the dylib exist outside the dylib.
- No symbols outside the dylib are linked to a symbol exported by the dylib.
- No running threads are owned by the dylib.
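The first two conditions can be partly encoded with lifetimes. The following is a minimal sketch, not a vetted wrapper: `Library` and `Symbol` are hypothetical types, the `dlopen`/`dlsym`/`dlclose` declarations are written by hand rather than taken from a bindings crate, and `RTLD_NOW = 2` is the value used on Linux and macOS (an assumption elsewhere). The `unsafe extern` spelling assumes Rust 1.82 or later.

```rust
use std::ffi::{c_char, c_int, c_void, CStr};
use std::marker::PhantomData;

// Hand-written declarations of the POSIX dynamic loading API.
unsafe extern "C" {
    fn dlopen(filename: *const c_char, flags: c_int) -> *mut c_void;
    fn dlsym(handle: *mut c_void, symbol: *const c_char) -> *mut c_void;
    fn dlclose(handle: *mut c_void) -> c_int;
}

// Value on Linux and macOS; treat as an assumption on other platforms.
const RTLD_NOW: c_int = 2;

/// Hypothetical owner of a `dlopen` handle.
pub struct Library {
    handle: *mut c_void,
}

/// Hypothetical symbol that borrows from the `Library` it came from.
pub struct Symbol<'lib> {
    ptr: *mut c_void,
    _lib: PhantomData<&'lib Library>,
}

impl<'lib> Symbol<'lib> {
    /// Escape hatch: the raw pointer is no longer tracked by the borrow checker.
    pub fn as_ptr(&self) -> *mut c_void {
        self.ptr
    }
}

impl Library {
    /// `None` asks for a handle to the main program, like `dlopen(NULL, ...)`.
    /// Unsafe because loading a library can run arbitrary initializers.
    pub unsafe fn open(path: Option<&CStr>) -> Option<Library> {
        let raw = path.map_or(std::ptr::null(), |p| p.as_ptr());
        let handle = unsafe { dlopen(raw, RTLD_NOW) };
        if handle.is_null() { None } else { Some(Library { handle }) }
    }

    /// The returned `Symbol` cannot outlive `self`.
    pub fn get<'lib>(&'lib self, name: &CStr) -> Option<Symbol<'lib>> {
        let ptr = unsafe { dlsym(self.handle, name.as_ptr()) };
        if ptr.is_null() { None } else { Some(Symbol { ptr, _lib: PhantomData }) }
    }

    /// Consumes the library, so no `Symbol<'lib>` can still be alive here.
    /// Still unsafe: the caller must rule out running threads, TLS
    /// destructors, and raw pointers copied out via `as_ptr`.
    pub unsafe fn close(self) -> bool {
        unsafe { dlclose(self.handle) == 0 }
    }
}
```

With this shape, calling `close` while a `Symbol` is still in use is rejected by the borrow checker, because `close` takes `self` by value. The remaining conditions (symbols linked from outside the dylib, threads owned by it) are invisible to the type system, which is why `close` stays `unsafe`.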

In terms of the opsem modeling dlclose, I think that can straightforwardly be a forced unwind of any terminated threads (i.e. UB, currently) and then popping all borrow tags from all memory owned by the dylib (causing UB if any protectors exist), plus some way to mark the unloaded functions as UB to call.

Answer 4 · 2024-08-14T00:08:54.000Z

Also:

- No outstanding TLS objects defined by the dylib have non-trivial destructors.

Answer 5 · 2024-08-21T05:55:14.000Z

Someone is saying that at exit, some shared objects get unloaded as if by dlclose while other threads are still running. Does anyone know more about that? Which libcs are doing this? Where is this documented? Which shared objects exactly are affected here? Given how much of a foot-nuke dlclose is, this sounds like a terrible idea...

Answer 6 · 2024-08-22T10:48:24.000Z

It is worth noting that it looks like the comment @RalfJung mentioned above was in fact incorrect, and dlclose on exit doesn't happen (unless some library does it itself from atexit, but there is not much you can do about badly designed C libraries in general...)

Answer 7 · 2024-08-22T10:52:49.000Z

exit calls __cxa_finalize with NULL as the DSO pointer, which runs both global finalizers (atexit) and all loaded DSO-local finalizers (__cxa_atexit with a non-null DSO identity pointer) in the total reverse order of registration.

To my knowledge, it doesn't unload the libraries except insofar as the entire address space gets unloaded when the process terminates.
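The ordering described here can be illustrated with a toy model. This is a sketch of the behaviour only, not of how any libc stores its finalizer list: `Finalizers`, `DsoId`, and the string labels are hypothetical, and `None` stands in for the NULL DSO pointer used by plain `atexit` registrations.

```rust
/// Hypothetical stand-in for a DSO identity pointer.
pub type DsoId = u32;

/// Toy finalizer list: (owning DSO or `None` for plain atexit, label, already run?).
pub struct Finalizers {
    entries: Vec<(Option<DsoId>, &'static str, bool)>,
}

impl Finalizers {
    pub fn new() -> Self {
        Finalizers { entries: Vec::new() }
    }

    /// Models `atexit` (dso = None) or `__cxa_atexit` (dso = Some(id)).
    pub fn register(&mut self, dso: Option<DsoId>, label: &'static str) {
        self.entries.push((dso, label, false));
    }

    /// Models `__cxa_finalize(d)`: `None` selects every entry, `Some(id)` only
    /// that DSO's entries. Selected entries that have not run yet are "run"
    /// in reverse registration order and marked so they never run twice.
    pub fn finalize(&mut self, dso: Option<DsoId>) -> Vec<&'static str> {
        let mut ran = Vec::new();
        for entry in self.entries.iter_mut().rev() {
            let selected = dso.is_none() || entry.0 == dso;
            if selected && !entry.2 {
                entry.2 = true;
                ran.push(entry.1);
            }
        }
        ran
    }
}
```

In this model, `finalize(Some(id))` corresponds to what happens for a single DSO on unload, and `finalize(None)` to what `exit` does; entries already run for an unloaded DSO are skipped at exit.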

Answer 8 · 2024-08-22T11:14:18.000Z

> It is worth noting that it looks like the comment @RalfJung mentioned above was in fact incorrect, and dlclose on exit doesn't happen (unless some library does it itself from atexit, but there is not much you can do about badly designed C libraries in general...)

Okay, phew. That's good to hear. So we can focus on explicit calls to dlclose again here. :)