dotnet/Silk.NET

Can't load native libraries on Pop!_OS

Closed this issue · 5 comments

Summary

Silk.NET seems to have some native dependency resolving logic that fails to find the correct folder in runtimes/ on Pop!_OS 22.04.

Steps to reproduce

  • Platform: Desktop, Pop!_OS 22.04 LTS, .NET SDK 9.0.106
  • Framework Version: targeting .NET 8.0
  • API: SDL (but probably the rest too, that's just the first failure)
  1. Try to import/use SDL from Silk.NET (just GetApi() is enough to trigger it)
  2. Run program with dotnet run

Comments

Running the program with strace indicates that the paths being searched are bin/Debug/net8.0/runtimes/pop.22.04-x64/native/libSDL2-2.0.so, followed by a bunch of system library paths. Obviously, the actual library it should be finding is in runtimes/linux-x64 and not runtimes/pop.22.04-x64.

I looked into the logic for selecting this path and it seems that this RID is coming from Microsoft.DotNet.PlatformAbstractions.RuntimeEnvironment.GetRuntimeIdentifier, in two places:

var currentRid = RuntimeEnvironment.GetRuntimeIdentifier();

and

foreach (var rid in GetAllRuntimeIds(RuntimeEnvironment.GetRuntimeIdentifier(), DependencyContext.Default))

I saw inside Microsoft.DotNet.PlatformAbstractions.RuntimeEnvironment.GetRuntimeIdentifier that the RID can be overridden by an environment variable, so I tested with export DOTNET_RUNTIME_ID=linux-x64. This did allow the packages to load, but it isn't really a "proper" solution.

I saw here that this package is removed from newer versions and on NuGet it seems to not have been updated in almost 5 years.

I also tested and found it gave a different result to the built-in API:

> Microsoft.DotNet.PlatformAbstractions.RuntimeEnvironment.GetRuntimeIdentifier()
"pop.22.04-x64"
> System.Runtime.InteropServices.RuntimeInformation.RuntimeIdentifier            
"ubuntu.22.04-x64"
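
For anyone wanting to check the same thing from a program rather than a REPL, here is a minimal sketch. It only uses the built-in RuntimeInformation.RuntimeIdentifier property (.NET 5 and later) plus the DOTNET_RUNTIME_ID variable mentioned above. The RidLookupSketch class and CurrentRid method are hypothetical names for illustration, and this is not how Silk.NET currently resolves the RID.

```csharp
using System;
using System.Runtime.InteropServices;

public static class RidLookupSketch
{
    // Hypothetical helper: honours the DOTNET_RUNTIME_ID override if it is set,
    // otherwise asks the runtime which RID it considers itself to be running on.
    public static string CurrentRid()
    {
        var overrideRid = Environment.GetEnvironmentVariable("DOTNET_RUNTIME_ID");
        return string.IsNullOrEmpty(overrideRid)
            ? RuntimeInformation.RuntimeIdentifier
            : overrideRid;
    }
}
```

Printing RidLookupSketch.CurrentRid() with and without the variable exported should make it clear which value a RID-keyed lookup would end up using.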

I'm really not sure why this is so broken. There was another bug recently where we weren't checking the runtimes directory of the output when no RID had been specified at build time, even though we should be doing that.

Will accept PRs here with open arms.

I wish I could help more, but it kind of feels like if I wanted to get anywhere I'd end up tearing out the library loading logic and replacing it wholesale, which I don't feel confident enough to do here.

e.g., just using the builtin System API to do:

NativeLibrary.Load("libSDL2-2.0.so", Assembly.GetExecutingAssembly(), DllImportSearchPath.ApplicationDirectory);

This works perfectly on my platform and runtime, but maybe Silk.NET has custom resolving logic for compatibility with other platforms that I don't have the ability to work with.

So we do eventually boil down to that, but we do a bunch of additional stuff to pick the right binaries out of NuGet packages, for instance, so that it works under dotnet run. However, the first thing we should be trying is NativeLibrary.Load("libSDL2-2.0.so") (i.e. we're meant to just pass the value through in the first pass), only going into our custom logic once that doesn't work.
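
The intended order described above (pass the name through first, then try custom candidates) can be sketched roughly as follows. This is an illustration only, not the code in Silk.NET.Core.Loader: LoadOrderSketch, its TryLoad method and the fallbackCandidates parameter are hypothetical names. The only real API used is NativeLibrary.TryLoad(string, out IntPtr), which needs .NET Core 3.0 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.InteropServices;

public static class LoadOrderSketch
{
    // Tries the plain library name first and only then walks the candidate
    // paths produced by custom resolution (e.g. runtimes/<rid>/native/...).
    public static bool TryLoad(string name, IEnumerable<string> fallbackCandidates, out IntPtr handle)
    {
        // First pass: hand the name to the runtime loader unchanged.
        if (NativeLibrary.TryLoad(name, out handle))
            return true;

        // Second pass: explicit candidate paths, in the order given.
        foreach (var candidate in fallbackCandidates)
        {
            if (NativeLibrary.TryLoad(candidate, out handle))
                return true;
        }

        handle = IntPtr.Zero;
        return false;
    }
}
```

Using the TryLoad variants keeps a failed first pass cheap, since no exception has to be thrown and caught before the custom candidates are tried.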

This is actually one of the easiest parts of Silk.NET to work on. The Silk.NET.Core.Loader namespace is quite self-contained, and all of our path resolving logic is in DefaultPathResolver (a single file that is quite easily grokkable).

I did some further testing and noted that on Ubuntu, where it works fine, I get these RIDs:

> Microsoft.DotNet.PlatformAbstractions.RuntimeEnvironment.GetRuntimeIdentifier()
"ubuntu.22.04-x64"
> System.Runtime.InteropServices.RuntimeInformation.RuntimeIdentifier
"ubuntu.22.04-x64"

On Ubuntu, ctx.RuntimeGraph is then used to get the fallback RIDs for ubuntu.22.04-x64, one of which is linux-x64, and the library is found in that location. Notably, RuntimeGraph does not contain a full graph: it has a single entry, for ubuntu.22.04-x64 only.
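
To show why a single-entry graph matters, here is a small sketch of the lookup described above, with a plain dictionary standing in for the runtime graph. RidFallbackSketch and ProbeDirectories are hypothetical names, and the fallback list in the usage note below is abbreviated for illustration; this is not the DefaultPathResolver code.

```csharp
using System.Collections.Generic;
using System.IO;

public static class RidFallbackSketch
{
    // Returns the runtimes/<rid>/native directories to probe, in order:
    // the RID itself first, then its fallbacks if the graph has an entry for it.
    public static List<string> ProbeDirectories(
        string baseDir,
        string rid,
        IReadOnlyDictionary<string, string[]> graph)
    {
        var rids = new List<string> { rid };
        if (graph.TryGetValue(rid, out var fallbacks))
            rids.AddRange(fallbacks);

        var dirs = new List<string>();
        foreach (var r in rids)
            dirs.Add(Path.Combine(baseDir, "runtimes", r, "native"));
        return dirs;
    }
}
```

With a graph whose only key is ubuntu.22.04-x64 (mapped to, say, ubuntu-x64 and linux-x64), asking for ubuntu.22.04-x64 yields three directories ending in runtimes/linux-x64/native, while asking for pop.22.04-x64 yields only runtimes/pop.22.04-x64/native, because there is no entry to expand.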

On Pop!_OS, as I mentioned earlier, the builtin RID property returns an Ubuntu RID, while the one from Microsoft.DotNet.PlatformAbstractions (which is probably outdated, as it has not been updated in a very long time) returns pop.22.04-x64. However, the RuntimeGraph only contains fallbacks for ubuntu.22.04-x64. I can verify this by setting the override RID environment variable to ubuntu.22.04-x64, which also allows packages to load on Pop!_OS.

From what I can tell, the RuntimeGraph seems to come from /usr/lib/dotnet/shared/Microsoft.NETCore.App/8.0.16/Microsoft.NETCore.App.deps.json, and it contains just the Ubuntu fallbacks, because that's all it expects the application to need to run on this system. The issue is that .NET Core seems to have made the decision to consider Pop!_OS as basically Ubuntu for RID purposes (which makes a lot of sense to do) but Microsoft.DotNet.PlatformAbstractions.RuntimeEnvironment.GetRuntimeIdentifier has not got the memo. This could be related to the breaking changes made by .NET 8 (here and here).

In my case, I'd fix it by switching to the inbuilt RuntimeIdentifier API, but sadly according to the docs it is only supported from .NET 5 onwards, and Silk.NET is supposed to support netstandard 2.0, so my understanding is that it wouldn't be usable. I know that Silk.NET 3 is in development targeting .NET 6, so it's probably worth leaving any meaningful rewrite until then and just bodging it in Silk.NET 2 for the time being, especially as there's already the GuessFallbackRid method, which fixes a bunch of specific cases. For now, I'll submit a PR to add a rule there that corrects pop to ubuntu. Just give me a bit to figure out how to build and test with this repo.
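
The kind of rule proposed above might look roughly like this. It is a sketch only and not Silk.NET's GuessFallbackRid: the class and method names are hypothetical, and it assumes Pop!_OS version numbers line up with Ubuntu's, as they do for the 22.04 RIDs in this thread.

```csharp
using System;

public static class RidNormalizationSketch
{
    // Hypothetical rule: rewrite "pop.<version>-<arch>" to "ubuntu.<version>-<arch>"
    // and leave every other RID untouched.
    public static string NormalizeRid(string rid)
    {
        const string popPrefix = "pop.";
        return rid.StartsWith(popPrefix, StringComparison.Ordinal)
            ? "ubuntu." + rid.Substring(popPrefix.Length)
            : rid;
    }
}
```

Under that assumption, NormalizeRid("pop.22.04-x64") gives "ubuntu.22.04-x64", which is the key the runtime graph on this machine actually has fallbacks for.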

Further note: I just realised what the issue probably is: the dotnet SDK I have from my package manager was built for Ubuntu. This is very normal on Pop!_OS, as the distro is heavily based on Ubuntu, and it's why my runtime graph contains the fallbacks for ubuntu. The loader is searching for the OS that I'm running, when it should be searching for the OS that my installed runtime was built for. This doesn't really change the remediation steps, though; for now the easiest thing is to just add a case for it. In the future, when Silk.NET is .NET Core-only, using a more modern API like NativeLibrary (which again isn't available in netstandard) should hopefully work a lot more reliably.