dotnet/runtime

Support custom hosting a hybrid between app and component

rseanhall opened this issue · 38 comments

In .NET Core 3.x, work was done to make custom hosting easier. Support was added for an "application" and a "component". I need support for a hybrid between these two.

The general use case is for an application where its features are spread across native and managed code, with the entry point being in native code. The details of my specific use case are here: #34946.

It's like an "application" but not "component":

  • This is the main application.
    -Missing dependencies are an error.
    -Loading into default ALC is expected.
    -Any isolation must be implemented by the app itself.
  • It may be deployed as self-contained or framework-dependent.

It's like a "component" but not an "application":

  • No static void Main(string[] args), no concept of args
  • Entry point is custom
  • Managed code needs to continue running after execution has passed back to native code

Unique aspects for this hybrid scenario:

  • The native code portion will initialize the .NET Core runtime very early in the lifetime of the process.
  • If deployed as self-contained, then it was published with a RID compatible with the native code portion of the application.

Reasons to be implemented in .NET Core instead of by the custom host using coreclr directly:

  • Logic to find the framework requested by the app is very complex and is subject to change with each new release of the framework.
  • Building the TPA list from deps.json files is very complex and is subject to change with each new release of the framework.
  • Various documentation discourages using coreclr directly, and some even make it seem like it could be deprecated in the future.

Proposal

Original proposal Add two new exported functions to `hostfxr` - `hostfxr_initialize_for_managed_host` and `hostfxr_create_delegate`. These two functions are similar to the existing `hostfxr_initialize_for_dotnet_command_line`/`hostfxr_run_app` and `hostfxr_initialize_for_runtime_config`/`hostfxr_get_runtime_delegate` function pairs in that the handle created from `hostfxr_initialize_for_managed_host` can only be used with `hostfxr_create_delegate`.
//
// Initializes the hosting components using an application
//
// Parameters:
//    managed_host_path
//      Path to the managed host assembly file
//    deps_json_path
//      Optional. Path to the .deps.json file
//    runtime_config_path
//      Optional. Path to the .runtimeconfig.json file
//    parameters
//      Optional. Additional parameters for initialization
//    host_context_handle
//      On success, this will be populated with an opaque value representing the initialized host context
//
// Return value:
//    Success          - Hosting components were successfully initialized
//    HostInvalidState - Hosting components are already initialized
//
// This function will find the .runtimeconfig.json and .deps.json corresponding to the given 
     application
// with which to resolve frameworks and dependencies and prepare everything needed to load the runtime.
//
// This function does not load the runtime.
//
typedef int32_t(HOSTFXR_CALLTYPE *hostfxr_initialize_for_managed_host_fn)(
    const char_t *managed_host_path,
    const char_t *deps_json_path,
    const char_t *runtime_config_path,
    const struct hostfxr_initialize_parameters *parameters,
    /*out*/ hostfxr_handle *host_context_handle);

//
// Create a native callable function pointer for a managed method from the currently loaded CoreCLR or from a newly created one.
//
// Parameters:
//     host_context_handle
//       Handle to the initialized host context
//     entry_point_assembly_name
//       Name of the assembly which holds the custom entry point
//     entry_point_type_name
//       Name of the type which holds the custom entry point
//     entry_point_method_name
//       Name of the method which is the custom entry point
//     delegate
//       Output parameter, the function stores a native callable function pointer to the delegate at the specified address
//
// Returns:
//     The error code result.
//
// The host_context_handle must have been initialized using hostfxr_initialize_for_managed_host.
// The assembly is loaded into the default AssemblyLoadContext.
//
typedef int32_t(HOSTFXR_CALLTYPE *hostfxr_create_delegate_fn)(
    const hostfxr_handle host_context_handle,
    const char_t *entry_point_assembly_name,
    const char_t *entry_point_type_name,
    const char_t *entry_point_method_name,
    /*out*/ void **delegate);

hostfxr_initialize_for_managed_host is essentially the same as hostfxr_initialize_for_dotnet_command_line but instead of argc/argv, it takes the application, deps.json, and runtimeconfig.json as explicit parameters.

hostfxr_create_delegate is simply a wrapper around the create delegate functionality of coreclr. hostfxr_get_runtime_delegate could not be reused since it always loads the target assembly into an isolated ALC (which is why it requires a handle from hostfxr_initialize_for_runtime_config - isolation is not supported in self-contained applications and it blocks on SCD).

Design document update and discussion: #36990

Tagging subscribers to this area: @vitek-karas, @swaroop-sridhar
Notify danmosemsft if you want to be subscribed.

We've been discussing something like this recently, thanks for bringing this up again and actually starting the technical solution.

I do like the description of "it's like app and it's like component at the same time" 👍 . Although these two are very related I think it might be beneficial to handle them separately:

Allow self-contained runtime in more scenarios

This can be achieved in two ways:

  • Allow loading a self-contained runtime via hostfxr_initialize_for_runtime_config.
  • Allow using hostfxr_initialize_for_dotnet_command_line with hostfxr_get_runtime_delegate.

Both can be done, the main differences are:

  • hostfxr_initialize_for_dotnet_command_line effectively require the managed "component" to be built like an application (`.exe)
  • hostfxr_initialize_for_dotnet_command_line assumes a command line on the input and support parsing some of the hosting layer command line options.
  • Going forward we may add support for single-file .exe into the hostfxr_intialize_for_dotnet_command_line, but not the other API.

The one which I've seen recent requests for is much more in line allowing SCD with hostfxr_initialize_for_runtime_config.

The obvious limitation would be that SCD .runtimeconfig.json is the only one allowed to be loaded (so it has to be the first and the only one, we would not allow combining SCD and FDD in any way).

Note: Technically it might be possible to allow SCD to be loaded first and then any number of FDDs if they match as that is basically the same scenario as loading plugins to SCD app - which is supported. But for now we can disallow that for simplicity and for lack of scenarios.

Scenario where this ability is useful without the second (loading into default load context): Native app which needs to load multiple managed components which are all implemented together in one "large" SCD project but still want ALC isolation for some reason (maybe in the future combined with unloadability). I don't have a specific example in mind right now - but it doesn't sound too crazy.

Allow loading "components" into the default load context

This has been actually discussed during the initial proposal for the new hosting APIs. In the end we decided not to do this as we didn't have enough scenarios to justify the added complexity. But it is a natural extension. To make this possible we actually designed the API to allow it in the future. The load_assembly_and_get_function_pointer_fn takes a void * reserved parameter which was intentionally added to allow extending the capabilities of this method in the future. When this was added we had in mind mostly two cases:

  • Allow loading into default load context (what you're looking for)
  • Allow loading into collectible load context (allow unloading the secondary load context)

Please note that while load_assembly_and_get_function_pointer_fn currently always loads into secondary ALC, there's nothing in the hosting layer which assumes that's the case. The hostfxr_get_runtime_delegate makes no assumptions for this (so we don't need a new hostfxr entry point) and the load_assembly_and_get_function_pointer_fn also doesn't hardcode this in the API in any way (other than it's the only one which works right now).

I would probably do this by defining a new struct to pass in the reserved parameter:

struct load_assembly_and_get_function_pointer_parameters
{
    size_t size; // Acts as a version number effectively
    bool load_into_default_load_context;
}

And then redefine the load_assembly_and_get_function_pointer_fn like this:

typedef int (CORECLR_DELEGATE_CALLTYPE *load_assembly_and_get_function_pointer_fn)(
    const char_t *assembly_path      /* Fully qualified path to assembly */,
    const char_t *type_name          /* Assembly qualified type name */,
    const char_t *method_name        /* Public static method name compatible with delegateType */,
    const char_t *delegate_type_name /* Assembly qualified delegate type name or null */,
    const load_assembly_and_get_function_pointer_parameters *parameters /* additional parameters */,
    /*out*/ void **delegate          /* Pointer where to store the function pointer result */);

Existing code would keep working as-is, and new code can pass in the parameters and alter the behavior.

This behavior has actually been asked for already without the SCD support - so we know there are scenario which want this regardless of the first change.

Given that this is a non-trivial change to the native hosting design, could you please start a separate PR (separate from the implementation PR) to update the design doc. That would be a good place to have the detailed design discussion. The doc also has some more details on interactions of the various APIs and hosting modes and it should help us make sure that the proposed changes play nice with the existing functionality.

Can you give me an idea of whether this can make it in .NET 5? Will the added complexity of getting this working in the existing API's affect if it does? To be honest, using your proposal would require a lot of additional work and I don't want to spend time on this right now if it has to wait for .NET 6.

This can definitely make 5.0 (there's still probably around 2 months before any kind of lock down for the release). I'll have to think about this some more:

  • I would like to avoid introducing new APIs unless absolutely necessary (it makes it hard to explain which APIs should be used for what).
  • I realized that my proposal introduces a weird discrepancy with regard to TPA behavior - for FDD TPA would have just frameworks, for SCD it would have everything. Fixing that basically means fixing dotnet/sdk#3339 and teaching how to initialize hostfxr in libhost mode over SCD.
  • What you're proposing is basically hostfxr_intitialize_for_runtime_config but with running the hostfxr in something like an application mode (you introduced a new mode, maybe that's the best option - I don't know yet). Maybe this should be explicit, use hostfxr_initialize_for_runtime_config and add an option to run in this new mode (either app mode, or completely new).
  • One way to do that would be to add a deps.json path parameter to hostfxr_initialize_parameters - this could be used even for hostfxr_intialize_for_dotnet_command_line, there's no discrepancy there. Naming would be key here, to make it clear that this changes how the system behaves (to a degree).

Honestly the cleanest way to fix this would be to fix the SDK issue described above, but that's not easy to coordinate.

I assume you have a specific scenario in mind when proposing this - what are the aspects of this which you need:

  • Is SCD a must?
  • Loading into default ALC - this can be done with the current design - it's just a bit cumbersome (load a small shim into custom ALC -> that loads the main thing into default ALC - needs custom resolver event handler, but doable).
  • Why is it important to load into default ALC? (Most things should by now work in custom ALC - although there are some known limitations).
  • Could you use a slightly different approach? Basically up side down. Managed would be the main driver and it would call into native. You could even have native init (using the hostfxr APIs), but the main thread would be owned by the managed Main.

I understand that adding new APIs to solve the problem is relatively clean in this case, but it would mean to maintain those going forward (basically forever). So we need to make sure that whatever new concepts we introduce make good sense and play well with everything else.

The specific scenario is the one in the issue I linked to - the Wix Toolset's bootstrapper functionality. It must start native because it's an installer, so it can't rely on anything being on the target machine. SCD support is a must - this allows users to write the BA in managed code without any dependencies being on the target machine.

Loading into the default ALC is required to be able to workaround any bugs, limitations, etc. around running in a non-default ALC that are lurking in .NET Core. I also feel like I need to point out that you can't currently shim around this in the SCD scenario because you can't even call hostfxr_get_runtime_delegate.

Do we need to decide on using the existing APIs vs new ones here or should I figure out how to write both of them up in a design PR?

I'll discuss this with couple of people on our end (calls - as that's faster and I will update the issue with any notes I'll have).

Over night I realized two additional things we need to consider:

SDK support for SCD components

.NET Core SDK currently doesn't really support building self-contained "components" - that is classlib projects. Unfortunately this has been "broken" for a while: dotnet/sdk#2549. It's possible to force it on the command line, but the result is "unknown" - it might work, but it might not - we simply don't test this scenario. So in this sense the only "good" way to produce an SCD is building it as an application.

Allow usage of hostfxr_initialize_for_dotnet_command_line together with hostfxr_get_runtime_delegate

This might be actually an easier way to enable the scenario. There are some complexities in hostpolicy around synchronization of the contexts, but other than that it might be relatively easy to enable this. This would also naturally lead to loading everything (the app as well) into the TPA and then just adding an option to load the "component" from the default ALC.
Given the SDK support mentioned above - since the right way to build SCD is to build an app, it feels natural that such payload should be initialized via hostfxr_initialize_for_dotnet_command_line.

I can't help but notice that you seem to keep on trying to frame this hybrid scenario as a component, but it really doesn't fit into any of the design decisions that have been made around them. It is essentially an application. hostfxr_get_runtime_delegate was designed and implemented around components. There are many assumptions and limitations in the implementation that it would only be used for components. So I don't think it's a good idea to try to use that for this scenario.

I don't mind reusing the hostfxr_initialize_for_dotnet_command_line entry point for initialization for this hybrid scenario, this is actually what I'm going to use so I can support 3.x (and then I'll load coreclr.dll directly and use CreateDelegate from there). The only problem I have with it is that it's not very discoverable - the args parameter makes it seem like you have to know a lot of internal details to know what to pass in. I would really like to see the CreateDelegate functionality from coreclr exposed from hostfxr since it's much nicer to use. The types don't have to be assembly qualified, and the implementation is powerful enough for you not to have to define your own delegate if you need a custom signature.

I chatted with @elinor-fung and we agreed that the app approach is something which makes sense and should work:

  • Use hostfxr_initialize_for_dotnet_command_line to initialize the context (and build the managed portion as a full blown app - with dummy Main. This is actually similar to what for example unit test projects do - they also build a dummy Main which is never called)
  • Remove the block from hostfxr_get_runtime_delegate and allow it in combination with the app initialized context from above. We think this is basically just removing the block - it should work as is. (will need a more detailed code review, but it was designed with this at least a little bit in mind, we just didn't have a scenario for it).
  • Add the option to load_assembly_and_get_function_pointer_fn to load into default ALC - and make that work for both app and component initialized contexts. (This is basically the most work as it requires some hooking up of the assembly dependency resolver).

To react to the CreateDelegate question: I don't think that the hostfxr_get_runtime_delegate is necessarily limited to components. It is currently hard coded to that - to reduce the testing matrix we need to cover, but there's really nothing in its design which should prevent it working on apps.

I do agree that the necessity of a delegate is unfortunate. The underlying reason is that it's implemented fully in C# and currently C# doesn't have a way to express the "random" delegate. This should be possible to "fix" when C# supports function pointers.

Exposing the coreclr_create_delegate directly: You can basically do that today if you use the hostfxr to populate the runtime properties for you, but not start the runtime. You can then start the runtime manually and call that. We tried to not expose it directly mainly because it has the limitation of only supporting assemblies from the TPA. Which works for your scenario, but doesn't work for most of the component scenarios.

I will chat with couple more people tomorrow, but right now I don't think the main ideas will change that much.

Would the above proposal work for you?

That proposal works for me. But is the team really willing to let the native host potentially call load_assembly_and_get_function_pointer_fn to load into an isolated ALC with SCD? Based on the existing implementation, it seems like the goal was to block all unsupported scenarios. Adding hostfxr_create_delegate like I did allows keeping all the blocks in place until it's fully supported.

Isolated ALC on SCD is almost fully supported as of 3.1. The only missing bit is the lack of framework/non-framework assembly information, but that is only important if the loaded component overrides framework assemblies - which is very rare (we know some apps do it in ASP.NET, but not many and the need for this is decreasing over time). This is basically the same as loading managed plugin into SCD app - should work just fine in 3.1 and 5.

The reason I'm hesitant adding hostfxr_create_delegate is that it would only work on app initialized contexts, it would NOT work on component contexts. And that is pretty confusing - especially in combination with the existing hostfxr_get_runtime_delegate.

The lack of framework/non-framework assembly information also means that framework assemblies can get loaded into the isolated ALC, when they should only ever be loaded into the default ALC.

That should really only happen if the component is built wrong (in 99% of cases).

  • The runtime doesn't prevent framework assemblies loaded into secondary ALC - regardless of it knowing what is framework (it actually never knows that right now).
  • AssemblyDependencyResolver will still return framework assembly even on FDD if the component being loaded contains that assembly in it. The only difference is that in FDD it will perform version checks and the higher version assembly will be returned, with SCD it will always return the one from the component.

So in short - if the component contains a framework assembly in it - it's a problem. For FDD it's not such a big problem, it should mostly work well. For SCD it may work a little less due to the above mentioned issue in SDK.

With current SDK on 3.0 and above there are only a few assemblies which can be included in the app and the framework at the same time - there are only a few assemblies which are part of the framework but we also publish them as separate NuGet packages which is the only normal way to get such assembly into the component. That is obviously ignoring intentional manipulation of files and such - but I discount that as a reasonable scenario.

I'm a little confused because in this scenario we are loading the app not a component. If it's SCD, then of course it's going to have framework assemblies in its deps.json.

I thought framework assemblies are only supposed to be loaded once and that problems could occur if they're loaded into multiple ALCs. Also, I thought that framework assemblies are not supposed to be unloaded, which might not be possible in this scenario today but could be if the option to make the isolated ALC collectible is added later.

You're right - I didn't realize this. For the SCD scenario loading into secondary ALC is probably a bad idea. Loading into default should work just fine.

But it's not as clear cut - I could hostfxr_initialize_for_dotnet_command_line an SCD app and then get a delegate for some completely unrelated plugin - in which case loading into secondary ALC is actually preferable.

Right now I don't know if there's something we can do in the code to make this easier... it's "Easy" to document, but that tends to not work very well.

The reason I'm hesitant adding hostfxr_create_delegate is that it would only work on app initialized contexts, it would NOT work on component contexts. And that is pretty confusing - especially in combination with the existing hostfxr_get_runtime_delegate.

It took me a while to formulate my response to this. I believe that it would only work on app initialized contexts, it would NOT work on component contexts is the defining characteristic of this new concept - it's not an "application" and it's not a "component".

  • An "application" will start the runtime by populating the TPA with frameworks and the app, then call coreclr_execute_assembly.
  • A "component" will start the runtime (assuming it wasn't already started) by populating the TPA with only the frameworks, then call coreclr_create_delegate to a framework provided method that loads the component into an isolated ALC.
  • This third concept will start the runtime by populating the TPA with frameworks and the app, then call coreclr_create_delegate to a method in an app assembly.

I called this third concept "managed host" in my initial solution. In my scenario there are basically three parts of the application - native, managed, and .NET Core host. The .NET Core host has a native dll (native host) and a managed dll (managed host). The native host has (what was supposed to be) a simple job - start the runtime and call into the managed host. The managed host is responsible for getting the managed part of the application up and running.

Originally I didn't think it really mattered whether to use load_assembly_and_get_function_pointer_fn or coreclr_create_delegate, but now I strongly believe that the "managed host" requires the functionality from coreclr_create_delegate. It might be possible to add that functionality to hostfxr_get_runtime_delegate, but I still think that's a bad idea. Maybe people skimming the exports of hostfxr could get confused between hostfxr_get_runtime_delegate and hostfxr_create_delegate, but if the former requires hostfxr_initialize_for_runtime_config and the latter requires hostfxr_initialize_for_dotnet_command_line then I think that's acceptable. In fact, it helps to make the "managed host" concept more discoverable.

@rseanhall I think I may have missed it, why is adding this functionality to hostfxr_get_runtime_delegate a bad idea?

Here are my initial comments on using hostfxr_get_runtime_delegate for this concept. For the coreclr_create_delegate functionality specifically, there was no extensibility parameter built into it so it would be quite ugly as well.

What is the next step for this? It seems like a decision needs to be made about whether to add a new api or use the existing one.

I think I'll have to write up the detailed proposal including some prototyping and so on.

I still don't think we need to add new APIs for this. I do agree that the docs available for these APIs are very bad (well, there are not real docs to begin with), but that's not an argument to introduce a new API. And using coreclr_create_delegate is problematic as noted above (and as you mention it's not extensible, so also not ideal).

One other idea (but I haven't spent the time to think it through thoroughly): Currently we allow calling for example hostfxr_get_runtime_property with a null context. The meaning is - use the "main" context - the first one. I don't see a reason why we would not allow also calling hostfxr_get_runtime_delegate with null context - it would operate on the "main" context as well. It's very similar to allowing it to be called on context created by the "initialize for command line", but it sort of makes it more flexible (I don't have to have the context handle to be able to call this).

I think the bigger API design problem is how to make it nice to specify loading into Default ALC versus secondary ALC and how to correctly describe which one is better when.

If adding new APIs is definitely not going to happen, can you close my PR?

Like I said, I really want this in .NET 5. Please let me know how I can help make this happen. Correct me if I'm wrong, but fitting this into the existing APIs is not really something that can be designed or implemented by external contributors.

I think I'll have to write up the detailed proposal including some prototyping and so on.

@vitek-karas I would like to help with this proposal. I have some ideas that we could employ that might make future extending simpler for external contributors like @rseanhall.

I hate to be that guy that constantly asks for updates, but I don't understand the process here. Should I just keep asking for updates every week until this is put into a milestone?

Sorry for the delay. I finally got to this and spent some time playing with the hosting layers. The result is #36990 which contains basically a proposal for the new scenario.

@rseanhall can you please read through that and make sure it covers your requirements? Any feedback would be more than welcome.

Please note that I can't commit to implementing this in .NET 5 (for resourcing reasons). That said, assuming we get a consensus on the proposed design changes, anybody can go and do the actual implementation. It's also possible that we will find time to work on this ourselves, but I can't promise that.

I stumbled across this issue while fixing a bug in the hosting code from nodeclrhost.

The bug was that the hosted MVC app failed to load its main assembly, which was already loaded to execute the entry point.
The MVC app is hosted using hostfxr_initialize_for_runtime_config. The entry point (assembly.EntryPoint) of the application is called by a static method in a referenced assembly, which is invoked using the function returned by hdt_load_assembly_and_get_function_pointer.

I solved it by setting the following runtime properties (which are explicitly not set for libhost hosting mode in hosting_policy.cpp):

  • APP_CONTEXT_BASE_DIRECTORY
  • APP_PATHS (The only one really needed and defaulted to ???)
  • APP_NI_PATHS
  • APP_CONTEXT_DEPS_FILES

As I do not fully understand why & how this works, i`m researching whether this is the right solution or if this is a misuse of the hosting API.

From my understanding my scenario is also a hybrid between a component & an application, so is this related?

Is the source of my Bug that hdt_load_assembly_and_get_function_pointer does not use the default AssemblyLoadContext?

I don`t want to hijack this Issue, but I do not have enough expertise at the moment to create a new Issue. Also I think that my scenario could help to provide some insight into a similiar use case of the hosting APIs.

@sanosdole Feel free to create a new issue, the info you provided above is definitely enough to start a discussion.

It's hard to tell exactly what's wrong without knowing the details of your setup. I only quickly looked at your code and it definitely could be the secondary ACL problem.

hdt_load_assembly_and_get_function_pointer will always load the specified assembly into a secondary load context (AssemblyLoadContext/ALC). This will use AssemblyDependencyResolver to provide assembly resolution into the secondary ALC. This class in turn relies on .deps.json - in your case it would be NodeHostEnvirontment.deps.json if it exists. If it doesn't exist, then it will basically enumerate all .dll files in the folder next to NodeHostEnvironment.dll. If the dependencies are not available in it, then it won't work.

Since it seems you're trying to run an arbitrary application this is likely to break as that application probably lives in a separate directory. One way to solve this would be to create another AssemblyDependencyResolver and use that to load the app - basically something similar to the "plugin" sample here. Create yet another secondary ALC, configure it with the new AssemblyDependencyResolver which points to the application main assembly (and thus its .deps.json) and load the application into it and run it there. Another approach might be to use the AssemblyDependencyResolver and hookup to resolution events in the default ALC and load the app directly into the default ALC and run it there - that's probably "Safer" from compatibility point of view as pretty much all apps run under default context (and some might have bugs which prevent them working from other contexts - unfortunately some of these bugs exist in the .NET Core framework itself right now).

This would also give you nice isolation of your app from the hosting code.

It's possible that if the scenario discussed in this issue was enabled, then you could use hostfxr_initialize_for_dotnet_command_line to load the target application directly and then just call into it. That might be easier to implement.

Just curious: Will your code ever try to load/run more than one application in a given process?

@vitek-karas

Just curious: Will your code ever try to load/run more than one application in a given process?

Short Answer: This is not an epected use case.

It is not prevented but would be unusual. The main intended use cases are writing a node/electron application using dotnet. The dotnet application is run by a js method runCoreApp and then uses the NodeHostEnvironment package to interact with the host. Usually JS code is just running the dotnet app, and the app decides when it will shutdown.

I used hostfxr_initialize_for_dotnet_command_line before, but after an earlier discussion with you i abonded it for the runtime config version. I had problems with the hostfxr_run_app shutting down early breaking the debugger support, as well as missing a callback that the async entry point of the dotnet app has completed.

Edit:

This is just right now. I could imagine that the project one day will support creating node packages using dotnet. Then we would potentially load multiple assemblies in one process, and use an API like var result = executeStaticDotnet("./dotnet/Assembly.dll", "SomeType", "SomeStaticMethod", a1, a2, ...). But i left this feature for now because of problems like conflicting dependencies and frameworks.

@sanosdole If I understand your app correctly, this is your workflow:

  1. User calls your runCoreApp API from JS pointing to their managed assembly
  2. runCoreApp eventually calls into your native host
  3. Your native host loads the runtime through hostfxr_initialize_for_runtime_config targeting the user's assembly
  4. Your native host calls into the assembly provided by your framework, NodeHostEnvironment, using hostfxr_get_runtime_delegate/hdt_load_assembly_and_get_function_pointer
  5. NodeHostEnvironment calls Assembly.Load to load the user's assembly and call the custom entry point

If so, the isolated AssemblyLoadContext isn't playing a part - the real issue is that you're calling Assembly.Load. That API always loads the assembly into a brand new ALC. You need to create your own AssemblyDependencyResolver to properly load the assembly into the isolated ALC.

However, I think that's the wrong approach because this issue is for your use case. Both of us has a native host that wants to call into our own managed code, which in turn calls into a user provided application.

The difference is that I don't support multiple managed applications. My recommendation is that you don't either. The reason for this is that .NET Core has poor support for dynamically loading code (if you want an example look into creating a custom MSBuild task with complex dependencies), let alone the issue of each application needing to be able to run in the first loaded runtime. From a compatibility perspective, you're going to run into far fewer issues by letting the framework initialize the runtime based on the user's app through apphost mode (hostfxr_initialize_for_dotnet_command_line) instead of libhost mode (hostfxr_initialize_for_runtime_config), and loading the user's app into the default ALC.

If you're interested in how I'm accomplishing this in 3.x, my native host is here and my managed host is here.

the real issue is that you're calling Assembly.Load. That API always loads the assembly into a brand new ALC.

I think you meant to say Assembly.LoadFile. Assembly.Load loads into the "current" ALC - that is the ALC the calling code belongs to. Assembly.Load never creates a new ALC. On the other hand Assembly.LoadFile does what you describe (it loads each assembly into its own ALC). And to finish the list Assembly.LoadFrom always loads into the Default ALC.

These behaviors were kept to maintain at least some level of compatibility with .NET Framework. If you're writing ALC aware code, it's much better to call AssemblyLoadContext.LoadFromAssemblyName or AssemblyLoadContext.LoadFromAssemblyPath directly and be explicit which ALC to use.

His project is calling Assembly.Load with the file path. Looking at the table, you are right. I knew that the byte[] overload creates a new ALC and incorrectly thought that the string overload did the same. However, I ran into the same issue where I was trying to call Assembly.Load from the method retrieved from hdt_load_assembly_and_get_function_pointer and it got loaded into the "wrong" ALC which I guess must have been the default ALC. I don't think the "current" ALC is getting set to the isolated ALC that was created for the component.

The "current" ALC should be determine as:

  • Look at the method calling the Assembly.Load (the one right above it on the callstack)
  • Take the assembly that method comes from
  • Take the ALC that assembly belongs to

If you see different behavior please let us know.

Note: There's a way to change that behavior via AssemblyLoadContext.CurrentContextualReflectionContext, but I assume you're not using that.

Oh, that's what it means in that table by "Inferred from caller". Good to know.

After thinking about it some more, I think this is what's happening. Originally, we weren't setting the APP_PATHS runtime property, the target of hdt_load_assembly_and_get_function_pointer was our own dll without a deps.json, and we were trying to load the user's application from Assembly.Load. In this scenario, Assembly.Load fails to find the dependent assemblies and the loading of the assembly fails. (We should have used AssemblyDependencyResolver to help load its dependencies into the isolated ALC but neither of us knew better at the time.)

To try to fix this, we started setting the APP_PATHS property to the directory with the user's application. This allows the Assembly.Load call to complete successfully. Correct if I'm wrong here, but this actually ends up loading the dependent assemblies into the default ALC.

While typing this out, I've figured out what my issue was. I originally wasn't calling Assembly.Load, I was calling AppDomain.Current.Load. Presumably this is merely a wrapper around Assembly.Load, but this probably breaks the callstack inspection and will always see the default ALC since AppDomain is a system type.

To try to fix this, we started setting the APP_PATHS property to the directory with the user's application. This allows the Assembly.Load call to complete successfully. Correct if I'm wrong here, but this actually ends up loading the dependent assemblies into the default ALC.

Correct.

While typing this out, I've figured out what my issue was. I originally wasn't calling Assembly.Load, I was calling AppDomain.Current.Load. Presumably this is merely a wrapper around Assembly.Load, but this probably breaks the callstack inspection and will always see the default ALC since AppDomain is a system type.

class AppDomain
{
    public Assembly Load(string assemblyString) => Assembly.Load(assemblyString);
}

Unfortunately also correct (I which we could just delete most of the assembly loading functions available on various classes as most of them are broken in subtle ways).

I'm not even sure we can actually fix this on the AppDomain - as that will be a breaking change (somebody might rely on the fact that it effectively always loads to Default).

For AppDomain that's really unfortunate because that was the correct way to load the assembly into the right context in .NET Framework.

Search for "Inferred from caller" in https://docs.microsoft.com/en-us/dotnet/core/dependency-loading/loading-managed.

To be honest even if we fixed the AppDomain.Load it would never load into the "right" context. The intent was to load to the context of the calling code, but that assumes that the logically calling code is also physically the immediate caller. Which is often not the case (case in point - AppDomain.Load) - lot of user code wraps assembly loading into utility functions which often live in separate assemblies which may be loaded into different load context.

Just to clarify the nodeclrhostscenario:

  1. I believe it uses the .deps.json file of the app by calling load_assembly_and_get_function with the full path to the app (entry point) assembly. Only the method it executes is from a managed assembly the app depends on.
  2. The delegate calls Assembly.Load with the assembly name (No path, no file extension) to load the app assembly and invokes the assembly.EntryPoint method.

This works usually fine. But when testing with a MVC app, it failed when attempting to register types from the app assembly itself. This was due to Assembly.GetEntryAssembly() returning null and APP_CONTEXT_BASE_DIRECTORY being empty. After setting APP_CONTEXT_BASE_DIRECTORY it failed to load the Assembly as it used Assembly.LoadFrom or LoadFile (not sure about this).

It was all OK once APP_PATHS were set.

I`m still not sure if this is an issue. We are basically a hybrid between an application and a plugin, so maybe it is totally fine that we have to set those runtime properties ourselves. I'm just pouring my scenario as input for further development of the hosting APIs, as well as trying to achieve a better understanding of the hosting model.

It's all fine up to the point where something calls Assembly.GetEntryAssembly which for "component" scenarios doesn't make sense (it correctly returns null as there's no clear entry point in this case). ASP.NET or really dependency injection doesn't typically play nice with secondary ALCs mostly because they were not written with ALCs in mind, so they make assumptions which are not always correct. APP_CONTEXT_BASE_DIRECTORY is a very similar problem to the above, again for components it's not defined really.

I think that for these kind of scenarios the managed code should be loaded "like an app", the only difference being that it won't be executed like an app.

@vitek-karas Can you set the labels, milestone, etc?