Update ActiveX and COM interop
Opened this issue ยท 18 comments
@awakecoding As discussed over at dotnet/runtime#66674 (comment) you were looking into using the new COM interop and being held back by the lack of TLB support. The TLB tooling however isn't critical, it's always been possible to write the interface declarations manually.
I did a very quick experiment (https://github.com/weltkante/MsRdpEx/tree/srcgen) to see into what problems I'm running and to give you a little bit of early feedback. I mostly copy/pasted the declarations from ILSpy and fixed issues as they presented, instead of properly porting the IDL of the TLB. Code not required for the simple scenario of setting up a RDP connection is also commented out.
Its not my intention to keep it it this way, next step is doing it properly, but it'll take some time and with holidays in between its nicer to show some early results instead of just waiting.
Things to note so far:
- IDispatch is required for event handling and not provided by the source generator, but can be implemented manually. In the current state I just used old COM interop for the event handler interface to get the IDispatch support.
- AxHost itself is still using the old COM interop, the documentation claims to support the code generators are supposed to be castable into old-style interfaces, but I couldn't get it to work. As a workaround I did manually conversion between the two styles of interop. Will have to research a bit more and maybe ask someone from the runtime team whats up with that, but its not much of an issue since a workaround is possible.
- WinForms itself is still using old COM interop for accessibility, so in no case you'll be able to disable it entirely in .NET 8. Not sure how that affects Native AOT, I've seen people work on enabling Native AOT for WinForms in some scenarios so I assume it might still work. If not thats something to work with the WinForms team on improving. (I haven't tried Native AOT on the example code yet.)
- Being source-compatible with the old generated library isn't possible, the source generators don't support properties so they have to be written as methods instead. Also as you already noted, the legacy tooling is sometimes generating APIs that are not possible to write in C# source. Personally I don't think its a big problem as I prefer handwritten declarations over the old tooling anytime, and its a one time effort to switch the library. Let me know what you think.
- I noticed your intermediate library claims to be netstandard but has nonstandard dependencies, so is a bit of a hack. You'll want to move away from building a fake netstandard package and do a proper multitarget package because a .NET 8 source generated library won't be usable for Desktop Framework, so you can no longer have one DLL for all frameworks. Fixing the multitarget issues will require some changes to the repo build process, as currently everything outputs into a single folder regardless of target framework, so if you multitarget the different targets overwrite each other.
For now I'll work on just .NET 8 support on the branch but if you intend to keep supporting Desktop Framework (and/or older .NET Core versions) using the same handwritten API then you'll have to figure out how to do the necessary changes to support multitarget builds (shouldn't be hard). From my side it'll not be hard to write them in a way thats consumable from either target framework. I can put the interop declarations into your AxInterop.MSTSCLib as I've been doing so far, or provide the entire API as a separate nuget package external to this repo, I don't really have any preferences, you can let me know later what you prefer.
First, thanks a lot @weltkante for taking the time to look into this!
I just built your branch and I'm surprised to see it's already minimally functional with everything rebuilt from source. It's a shame we can't build an API 100% compatible with the original, since backward compatibility is quite important here. We have a lot of code in Remote Desktop Manager that wraps the original Interop interface, and this is the case for pretty much all projects integrating the RDP ActiveX, most of which have their own copy of MSTSCLib.dll, AxMSTSCLib.dll, with or without the "Interop" namespace. Searching for "MSTSCLib.dll" in GitHub is a testament to how pervasive the RDP ActiveX interface is in practice.
This being said, I have refrained from wrapping the generated RDP ActiveX too much in MsRdpEx so far, such that it's still mostly used as a nuget package with clean, up-to-date Interop DLLs + MsRdpEx.dll API hooking for extensibility. Since the newer ActiveX/COM Interop would likely not be generated, we can do it cleanly in another namespace. This way, the original AxInterop.MSTSCLib.dll and Interop.MSTSCLib.dll could still be provided in the nuget package, but I'll find a way to make it a conditional import. My thinking is we can put the Interop code under a separate namespace, probably inside the Devolutions.MsRdpEx.dll assembly, otherwise as Interop.MsRdpEx.dll. I'm open to both approaches and undecided yet, maybe you have a preference for one or two assemblies here. As for AxInterop versus Interop, it always bugged me that we needed two separate assemblies for Interop code, I'm fine with just one.
As for the netstandard2.0 hack, you are correct - there was no simple way to mean "all .NET on Windows + WinForms" which is truly what it is. Interop.MSTSCLib.dll has an hardcoded reference to net40 (!) which throws a warning when imported as "netstandard2.0", but if you add a direct assembly reference to it instead, modern Visual Studio build environments fail to build because it thinks a true net4.0 SDK must be present. This is quite annoying and I'd like just to flip the bytes to make it net48 instead, but I couldn't figure how to do it post-build. I want to keep a clean copy of the original Interop DLLs for compatibility, Microsoft still makes update to the RDP ActiveX interface with some major OS releases, and I'd like to regenerate and update with tlbimp.exe, aximp.exe when that happens, and use it as reference for the modern ActiveX/COM Interop wrappers.
Here's what I suggest: multi-targeting to net48 and net8.0-windows, with conditional compilation options in the code. We just need a good strategy to make it manageable for everything that depends on net8.0 which can't be built for net48. When importing the net48 version of the package, none of the newer COM Interop would be available, only the "classic" stuff. We literally switch from net48 to net8.0-windows last week in the develop branch of Remote Desktop Manager, we will need the old Interop code for quite some time, even in .NET 8. We would only make the switch to the newer Interop code when it is exhaustively implemented and stable, as it is a critical component for us. We can also assume that everyone currently wrapping the RDP ActiveX interface in .NET 8 is using the only Interop code available (old one).
For a .NET 8 project that doesn't bring the old Interop DLLs along, I'll look into adding a condition and a .targets file in the nuget package, such that one could avoid bringing the dependency that is not compatible with Native AOT compilation.
One other question: my understanding is that with the new COM wrappers, it is not possible to define get/set properties on the Interface, which is a bit annoying. This would obviously cause a lot of changes in the consumer code because of this limitation, if we want to build the COM interop code from source. Do you know if there's a design pattern we could implement to wrap the raw interface with a C# class? For Instance, CMsRdpClientAdvancedSettings to wrap IMsRdpClientAdvancedSettings, where CIMsRdpClientAdvancedSettings is a class wrapping the IMsRdpClientAdvancedSettings interface with C# properties. I could see benefits in going beyond what raw COM interfaces can do, even if most of the code would be extremely repetitive. Maybe there's a way to use .NET 8 source generators to help with this?
As for the original interface, there was one place where I need to manually redefine things just to pass the LoadBalanceInfo as a raw BSTR without going through string marshalling, as it needs to be encoded as UTF-8 bytes. I have added the workaround with the interface definition here in the Devolutions.MsRdpEx assembly. We should keep in mind that the newer Interop code should be able to accommodate edge cases like this.
FYI, I know the current code is experimental and handwritten, but if you close the session right now, it throws this error:
If there's a way to semi-generate the Interop code, even with just a PowerShell script, it would significantly reduce the potential of human error when writing the newer interface. That's really what scares me the most, as it may work for simple cases, but we'll really want a way to make sure it is fully correct before we make the switch, and avoid destabilizing one of our most important features in Remote Desktop Manager. This is less of a hard requirement for smaller projects though, which can be more tolerant to some breakage as long as most of the interface works well.
Thanks again for your time! It's really refreshing to have someone really dig into ActiveX/COM Interop like this. I wish to have this repository the go-to project for the RDP ActiveX Interop, with a clean migration path from the older interface to the new.
Thanks for the comments, I generally understand and agree with you, to not make this overly long I'm just responding to things I have a point to make, if you want to discuss something specific I left out feel free to point it out.
Do you know if there's a design pattern we could implement to wrap the raw interface with a C# class? [...]
Yes, generation of wrappers may be more achievable than generation of interop, since it usually just passes arguments through and doesn't need to transform them. Its one of the options I'm considering, will have to see if I can come up with something simple enough to do myself, but if not its also something someone else can look into more easily than writing or generating interop itself.
[a generator] [...] would significantly reduce the potential of human error when writing the newer interface [...]
As far reliability/confidence is concerned, my suggestion would be not to focus on generating the code but to do unit tests comparing the manual written source - or rather, the compiled classes - against either the old interop assembly or even directly against the TLB. I can contribute some of this testing after I'm done with the initial pass through the interfaces and they can be extended anytime if someone feels the need for more.
That's really what scares me the most, as it may work for simple cases, but we'll really want a way to make sure it is fully correct before we make the switch, and avoid destabilizing one of our most important features in Remote Desktop Manager.
Overall I agree that its important to not destabilize the existing code/product, like you said, the new interop will need attention to make sure (as much as reasonably possible) that its correct - but it'll not be possible to guarantee a switch between technologies without friction as there may be yet undetected bugs in callers, or implicit dependencies on safeguards or behavior that have changed or been removed in the framework itself.
You are probably aware of it, but COM interop is very "dangerous" in that its easy to get semantics and responsibilities wrong. What you might not be aware is that the Desktop Framework interop implementation was never fully correct/stable either, there's always a trade-off between convenience, performance and safety against mistakes. The new implementation is taking different trade-offs in some places, so you will not get full equivalence at the runtime level either, not just at an API level.
For example releasing COM objects via the garbage collector was never possible to do in a perfectly stable way yet most people don't bother to do proper ownership management (even though it was technically possible) until they find themselves in a scenario where its absolutely needed, because its so much more convenient to ignore it. COM allows objects to assume being released on the apartment they were created on, which is something the framework cannot always arrange. Desktop Framework did a "best effort" to properly release references on their original apartment, as far as I can tell from docs and code the new generator does spend less effort on this, leaving more responsibility to the developer if they are working with objects that are "apartment sensitive" in that particular aspect. (I have no idea if thats relevant for the particular API we are working with here.)
My suggestion is to just take this as an experiment for now, see how far we get, and then make your own tests with your private code to be able to make an informed decision whether its worth to actually make the switch. The new interop definitely has advantages, but there are also risks that can't be avoided and need to be handled properly through testing of each consuming codebase using the new interop style. If the risks appear too high its always possible to give feedback to the runtime team or wait a release for the new infrastructure to stabilize and perhaps new features arrive, before evaluating the situation again.
Here's what I suggest: multi-targeting to net48 and net8.0-windows, with conditional compilation options in the code.
Should be possible to work out something as we go, like I said, the main problem at this point is that multi-targeting is not possible because all target outputs go into the same directory. Needs some restructuring of how the project is built, thats all. What needs to be done beyond that to build a decent package I can't tell yet, but you seem to have some ideas.
[...] if you close the session right now, it throws this error
Yeah, noticed this as well but hadn't figured it out when I did the first push, has been fixed now, was missing a conversion between interop styles in the CreateSink method and the exception there did not break into the debugger so I didn't immediately notice it. We'll have to determine later what the risk of these kinds of mistakes are, they happen when passing or retrieving interop references to/from WinForms, as they still use the old style interop. AxHost seems manageable via workarounds, but if references are exchanged with WinForms in a lot of other client code it may need improvements in WinForms, we'll have to see.
(one of my open points) [...] the documentation claims to support the code generators are supposed to be castable into old-style interfaces, but I couldn't get it to work.
Figured this one out, this particular behavior is off by default and has to be opted-in via a project file setting. The catch is that it doesn't work with Native AOT, so I'll keep the manual conversions since AOT mode is of particular interest to you.
I'm done with the first round through the TLB and got all interfaces written down, not pushed yet, will clean it up a bit and probably push it for the weekend. Also created a WinForms issue to track what needs to be improved there to get the prototype fully working with COM source generators.
As it stands currently, the controls event interface cannot be source generated in .NET 8 as far as I can tell, WinForms simply seems to reject it, even if it implements both source generator and classic interfaces. We'll get the bindings mostly source generated and I'll do PRs on WinForms to improve its support for a fully generated application for .NET 9, we'll have to see how much problems come up on fixing their bugs.
As for the original interface, there was one place where I need to manually redefine things just to pass the LoadBalanceInfo as a raw BSTR without going through string marshalling, as it needs to be encoded as UTF-8 bytes.
I've looked closer at BSTR interop and it seems to support both text content and binary content, while .NET just supports text content marshalling. This is a bit disappointing because it means you cannot use standard marshalling for the cases where the API wants to pass binary data through a BSTR parameter. The question is, if its even possible to identify all methods that might want to pass binary data through a BSTR - if not then my personal preference would be to just not use the standard marshalling at all for any BSTR and do a custom type instead of strings. There is a bit of design space and trade-offs to chose when deciding how the APIs should look like, you still want to be able to easily pass in or retrieve normal strings of course. I'm currently prototyping this along the rest of the interfaces but whether or not to do it, or how exactly the replacement type should look like and behave, can still be looked at later. In particular there's a trade-of between performance and safety against incorrect use to make which I'll explain later when I pushed.
After pushing the interfaces for the weekend, next week I'll be looking at compatibility with the old interop dll (some kind of generator I guess, hopefully simply just forwarding arguments between the two APIs).
Regarding BSTR marshalling - LoadBalanceInfo is really the only edge case I am aware of in the RDP ActiveX interface. I'm fine if we have to use something special to make it work - I used an alternative COM interface definition just to get the BSTR pointer as an IntPtr, but I guess we could also try using MsRdpEx extended COM interfaces. However, we then rely on the MsRdpEx API hooking to pass a BSTR which is supported in the regular public interface, and I don't really like that.
I get the feeling that there is no shortcut for COM Interop - we'll need a way to replace the accessors in the interfaces, and a way to pass that BSTR as an IntPtr
I get the feeling that there is no shortcut for COM Interop - we'll need a way to replace the accessors in the interfaces, and a way to pass that BSTR as an IntPtr
You can just define your own class, including marshalling, you'll see when I push tomorrow. I did a BinaryString
class/struct that manages the lifetime, is disposable etc. while implicit cast operator handles the conversion from both string and char/byte spans so it transparently supports all usescases BSTR has (no need for callers to change how they call it). Making it disposable/finalizable allows for safe lifetime management (much better than a plain IntPtr but has a small performance cost because to have a finalizer you have to do an additional heap allocation you don't have with string/IntPtr). If you can "trust" all callers to know what they are doing you can drop some of this convenience/safety and get more performance/less allocations but mistakes will be more fatal or harder to notice.
These custom rules of course deviate from standard interop (and the existing interface signatures) so it slightly increases friction.
While transcribing the interface it looked like a few more strings may be using binary data, maybe you don't use those APIs, maybe they are just normal strings. If we end up not needing custom BSTR handling except in that one place thats easy to change back to standard string APIs, we'll just have to see and try what we think works best.
- pushed the current state to the branch
- started doing unit tests, comparing the API to the original library (did help finding/fixing a few initial mistakes)
- the required mappings are surprisingly simple
- some compatibility mappings since COM source generators don't support IDispatch/Variant/bool (yet)
- BinaryString is just there because of the experiment to support both string and binary data, could be dropped back to string
- there are two bugs in the automatically generated assembly, see the last three entries in the mapping function, they have links to the docs (which differ from what the generated assembly does for these functions)
- the two event interfaces are still classic COM, they need bugfixing in WinForms before I can switch them
- I probably can get a source-compatible (minus the two API bugs) compatibility layer generated next week
@weltkante I'll be back from vacation tomorrow, and I'll next to fix an issue with .NET Framework UTF-8 string conversion in bindings that aren't part of the RDP ActiveX Interface + make a new release of MsRdpEx. It's annoying, but [MarshalAs(UnmanagedType.LPStr)] breaks UTF-8, and the proper UTF-8 support only came in after .NET Framework. I have to maintain and update MsRdpEx for Remote Desktop Manager 2023.3 which is still using .NET Framework 4.8, but even after we've switched to .NET 8 in Remote Desktop Manager 2024.1, I think we'll need to keep .NET 4.8 support for a while. Just recently I've used MsRdpEx in a decompiled Hyper-V Manager replacement.
I'm surprised you could find bugs in the original generated assemblies, I guess nobody noticed them for all that time? I'm amazed at how deep your knowledge is in COM Interop and .NET, keep up the good work. I'm learning a lot just by watching the progress you're making on your branch. Once we have a solution for source-compatible bindings, it should become a lot easier to start merging things.
Speaking of which, do you have a list of changes to MsRdpEx that could be done to help prepare it to receive your changes, given the requirement of backward compatibility with .NET Framework and the old COM Interop? I've mentioned multi-targeting and conditional compilation before + a .targets file for conditional import of specific assemblies in the consumer, this is something I could give a try after I'm done fixing the UTF-8 string conversion issues mentioned earlier. I don't want to be a blocker for this amazing work.
[MarshalAs(UnmanagedType.LPStr)] breaks UTF-8
Seems obvious to me, that mode uses the encoding for whatever codepage the OS is configured to use (may be set UTF-8 on modern Windows, but usually it isn't defaulting to that yet). Personally I'd always marshal UTF-8 as byte arrays/pointers/spans. LPSTR marshalling should only be used for APIs that expect strings in the current Windows ANSI codepage.
I'm surprised you could find bugs in the original generated assemblies, I guess nobody noticed them for all that time?
The bugs would trigger if you use NotifyRedirectDeviceChange
on a 32 bit machine with the interop DLL generated from the 64 bit TLB (or vice-versa), or if you try to make use of the HWND in the RemoteWindowDisplayed
event. Maybe nobody needs to do that, or those that do have workarounds, like you have been using the custom interface definition to be able to pass UTF8 through a BSTR.
Speaking of which, do you have a list of changes to MsRdpEx that could be done to help prepare it to receive your changes, given the requirement of backward compatibility with .NET Framework and the old COM Interop?
Changing the project to support multitargeting would be pretty high up the list because its holding me back from adding support for Desktop Framework. Like I said, I don't mind writing it in a way that can support both once the project can support it. It'll probably need a bit of experimentation to find the sweet spot between compatibility and supporting better APIs where available.
Beyond that I currently don't yet have anything else that is important, but once the first pass of the compatibility layer is generated (if all goes well by end of next week) I'll probably turn to debugging WinForms and see how to improve things on their side. You could see if you have any feedback on the compatibility layer and/or APIs at that point.
Also don't worry if you shouldn't be able to get things done in time, we can always take a break and come back to it later.
@weltkante I got my UTF-8 issue fixed earlier today - I think what happened is that I broke system code page support when I modified my native code to assume UTF-8 strings everywhere.
I've started a new branch to work on multi-targeting to net48 and net8.0-windows here, maybe you have comments already:
https://github.com/Devolutions/MsRdpEx/tree/multi-target
took a quick look and seems to work, will try moving the code over and see if I can do multitarget declarations
@weltkante I've got a pull request for initial multi-targeting, if it's all good with you, I'd merge that upstream: #101
Your choice if you want to merge or wait, I don't see anything wrong with it so far. I'll need a day or two to get everything up and running, can do that on the branch as well if you don't want to merge yet. Just rebased my code on top of the branch and got .NET 8 to compile, but of course Desktop Framework needs a bit more work to run off the same codebase (instead of just conditionally compiling the new files only for .NET 8) and I might wait a bit to get it working with the compatibility layer right away.
@weltkante I merged it, I just wanted to remove some of the noise. I expect more changes on top of this first pass as you work your way through it, but that's fine, at least we've got initial multi-targeting now
Got multi-targeting working on a new branch (https://github.com/weltkante/MsRdpEx/tree/srcgen2), mostly its just generating the classic ComImport-based APIs via a script. I've reverted the ActiveX controls using the "new" COM interop for now, since that will only be available on .NET Core and was causing conflicts compiling against Desktop Framework I didn't want right now, will get that integrated from the first branch later. The generator itself is also not pushed yet since I still have to figure out how to integrate it, but its basically just doing reflection on the binary assembly, same as the unit test does, nothing fancy.
The compatibility layer is causing some issues, I've been playing around with various ideas, e.g. something similar to SharpDX using classes instead of interfaces and otherwise having the same members. Being source compatible on members isn't that hard, but handling casting between types is causing trouble since C# doesn't let you do a lot to override casts. Things like SharpDX use helper functions for casting but your existing source probably just uses native casts and so it would be quite annoying having to update all casts in callers.
I've got one more thing I want to try though, IDynamicInterfaceCastable was added for .NET Core to allow reimplementing COM interop in a source generator and it may be possible to reuse the same mechanics for handling casting in a compatibility layer. It'll be a few more days until I have that generated though.
So, at the moment not much to test yet, just wanted to give an update where things are.
pushed an update with the missing pieces in place, that is multi-targeting, code generation and compatibility layer are there and I can run the same sample code with (almost) no source change against either .NET 4.8 or .NET 8
Still a bit early to ask you for feedback (some TODO's still in there, in particular around Variant support) but I thought I'd push early once I got something working again.
The ugly #if conditions around non-MSTSCLib interfaces are there to see how multi-targeting of them would look like, unfortunately some of those definitions aren't easily written in both styles since they need different attributes. Depending on how agressive you want to be in using source generated COM interop you could just leave them in the old style as well, or if you have a lot of them write some codegen to generate the other variant, maybe easier to maintain than all the #if conditions.
pushed another update, more missing parts in place (in particular Variant interop) - going to need a litte bit cleanup now, but the general functionality is where I want to have it. I've also rushed a bit and need to do a little more testing/review to make sure the marshalling works correctly, for now I just verified the example project still works. I'll try to finish the cleanup for the weekend.
conceptual changes:
- no longer need VariantBool structs in most of the places, don't know why the marshalling options don't show up in intellisense but they are still there, thought they were dropped from .NET Core when they didn't show up (though without VariantBool structs the interfaces need more verbose attribute annotations now, as the default behavior still differs from Desktop Framework)
- BinaryString no longer needs to be disposed, storing all of its data in managed format and only converting when crossing to native code, making it much nicer to use (can still drop the whole BinaryString thing from the signature if your testing shows that to be more useful, but from just the example project it seems to do no harm and saves from writing duplicate interface declarations when wanting to pass non-utf16-data through)
- made a similar class for Variant (making it much more flexible and efficient than what the framework does) but ended up not using it. Existing code will use object variables all over the place and there is no nice way to handle typecasts of objects that contain something different from what the classic com marshaller put in there.
While typecasting works with interfaces which are part of the library (and thus generated by the compatibility layer script), you won't be able to cross-cast into externally declared interfaces, so if calling code uses that it will be an annoying and require code edits on the call sites to "unpack" the compatibility layer and access the underlying object.
Overall, while the whole compatibility layer seems to work decently for the sample project and can reduce required code changes to almost nothing, but I expect that .NET 9 (once they support property syntax) will work even better out of the box and make the compatibility layer redundant.
I'll still finish cleaning up the prototype, but it may turn out that just waiting another release is more practical for you, depending on how testing against your codebase works out.
[edit] didn't finish the cleanup for the weekend, will happen next week. Mostly will be moving classes into proper namespaces, double checking visibility, looking at unit tests again.
I'll still finish cleaning up the prototype, but it may turn out that just waiting another release is more practical for you, depending on how testing against your codebase works out.
That would be my preference, but if there's a way to merge things upstream without having to migrate to the newer APIs (and therefore forcing the migration) that would be ideal. Otherwise, we can have as much of the changes merged upstream, with a clean branch of your precious work saved here until we're ready.
Already had pushed the namespace refactoring last week, now finished the code review I wanted to do, didn't see any obvious mistakes so no pushes regarding that. Still should do more unit testing but I'm a bit busy at the moment so those may take a while to get done, but this shouldn't prevent you from starting to look into doing your own tests if you want to. I think I'll put my priorities (once I get some free time) into seeing what I can get improved on the WinForms side. Though of course, if you find anything thats broken, have questions or feedback, I can make some time and take a look.
PS: there's no rush to get this tested or anything, I'm mostly done with what I wanted to achieve in the prototype, so its up to you when and what exactly you want to integrate in your repository and in which form. Like I said, I have no issues leaving this rest and revisit it later. Also if you have any suggestions to change how things work or are named, feel free to do (or ask), I'm not particularly attached to any of the specifics in the prototype, most things were just chosen on a whim to get things working and have no deeper reason.