dotnet/linker

Out of memory on GitHub Actions when publishing an application for Android with trimming enabled

pekspro opened this issue · 18 comments

When I try to publish a .NET 7 MAUI application targeting Android with trimming enabled on GitHub Actions, the trimmer runs out of memory:

D:\a\RadioStormBuild\RadioStormBuild\Source\Pekspro.RadioStorm.Settings.SynchronizedSettings.FileProvider.Graph\bin\Release\net7.0\Pekspro.RadioStorm.Settings.SynchronizedSettings.FileProvider.Graph.dll
  Pekspro.RadioStorm.Settings.SynchronizedSettings.FileProvider -> D:\a\RadioStormBuild\RadioStormBuild\Source\Pekspro.RadioStorm.Settings.SynchronizedSettings.FileProvider\bin\Release\net7.0\Pekspro.RadioStorm.Settings.SynchronizedSettings.FileProvider.dll
  Optimizing assemblies for size may change the behavior of the app. Be sure to test after publishing. See: https://aka.ms/dotnet-illink
  Optimizing assemblies for size. This process might take a while.
  Optimizing assemblies for size may change the behavior of the app. Be sure to test after publishing. See: https://aka.ms/dotnet-illink
  Optimizing assemblies for size. This process might take a while.
ILLink : error IL1012: IL Trimmer has encountered an unexpected error. Please report the issue at https://github.com/dotnet/linker/issues [D:\a\RadioStormBuild\RadioStormBuild\Source\Pekspro.RadioStorm.MAUI\Pekspro.RadioStorm.MAUI.csproj::TargetFramework=net7.0-android]
  Fatal error in IL Linker
  Out of memory.
Error: C:\Users\runneradmin\AppData\Local\Microsoft\dotnet\sdk\7.0.100\Sdks\Microsoft.NET.ILLink.Tasks\build\Microsoft.NET.ILLink.targets(86,5): error NETSDK1144: Optimizing assemblies for size failed. Optimization can be disabled by setting the PublishTrimmed property to false. [D:\a\RadioStormBuild\RadioStormBuild\Source\Pekspro.RadioStorm.MAUI\Pekspro.RadioStorm.MAUI.csproj::TargetFramework=net7.0-android]
  Optimizing assemblies for size may change the behavior of the app. Be sure to test after publishing. See: https://aka.ms/dotnet-illink
  Optimizing assemblies for size. This process might take a while.
  Fatal error in IL Linker
ILLink : error IL1012: IL Trimmer has encountered an unexpected error. Please report the issue at https://github.com/dotnet/linker/issues [D:\a\RadioStormBuild\RadioStormBuild\Source\Pekspro.RadioStorm.MAUI\Pekspro.RadioStorm.MAUI.csproj::TargetFramework=net7.0-android]
  Out of memory.
Error: C:\Users\runneradmin\AppData\Local\Microsoft\dotnet\sdk\7.0.100\Sdks\Microsoft.NET.ILLink.Tasks\build\Microsoft.NET.ILLink.targets(86,5): error NETSDK1144: Optimizing assemblies for size failed. Optimization can be disabled by setting the PublishTrimmed property to false. [D:\a\RadioStormBuild\RadioStormBuild\Source\Pekspro.RadioStorm.MAUI\Pekspro.RadioStorm.MAUI.csproj::TargetFramework=net7.0-android]
  Optimizing assemblies for size may change the behavior of the app. Be sure to test after publishing. See: https://aka.ms/dotnet-illink
  Optimizing assemblies for size. This process might take a while.
Error: Process completed with exit code 1.

I’m using the default Windows runner that has 7 GB of memory:
https://docs.github.com/en/actions/using-github-hosted-runners/about-github-hosted-runners

With trimming disabled, it works fine. I could publish with trimming enabled on my local computer that has a lot more memory.

The source code for the application is available here:
https://github.com/pekspro/RadioStorm

I also have an older version of the application targeting .NET 6. I have been able to trim this in GitHub without running out of memory.

It’s not a trivial app, but also not a very large one, I think. I just wanted to report this. I do not expect any solution for this :-)

Looks similar to #3119, though 7 GB is a lot.

/cc @vitek-karas

Might be worth trying to pass /maxcpucount:1 to dotnet build as a workaround to make sure it only tries to trim one app at a time.

Thanks @akoeplinger, that is an interesting idea. I'm trying that right now.

/maxcpucount:1 didn't help. It ran for about 90 minutes and then ran out of memory.

If it ran for 90 minutes "in the linker" (or anywhere in msbuild) then this is definitely a bug. I'll look into this tomorrow (sorry, too late today).

I tried /maxcpucount:1 in .NET 6 as well. It took about 20 minutes in GitHub Actions. Without this parameter, I think it takes about 15 minutes.

Tried it locally - it is really bad in 7.0:

Publishing the Android project spins up 4 linkers, each of which takes 25 minutes to run on my machine (which is reasonably fast, so this is really bad). Each one consumes 7-10 GB of memory (it oscillates in this range, so my guess is that it needs around 7 GB and the rest is the GC doing its thing).

When I reran only one of the illink invocations it was a bit faster, around 17 minutes, but still really slow. The memory consumption was as high as 11 GB (but there was less memory pressure on the system, so the GC didn't need to work as hard).

Using latest linker from main (8.0) it is much better - I reran one of the illink invocations and it took only 1 minute and used a little bit over 4 GB of memory. So it might still not work in a 7 GB-limited environment if 4 run at once, but it's definitely a LOT better.
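To make the arithmetic behind that concern explicit, here is a rough sketch. It assumes the figures quoted in this thread (a 7 GB runner, 4 parallel illink invocations, roughly 4 GB peak each with the 8.0 linker) and ignores memory used by the OS, MSBuild, and everything else on the machine, so the real headroom would be smaller. The function name is made up for illustration.

```python
import math


def max_parallel_trimmers(memory_limit_gb: float, peak_per_trimmer_gb: float) -> int:
    """How many trimmer processes fit in memory at their peak (optimistic estimate)."""
    return math.floor(memory_limit_gb / peak_per_trimmer_gb)


runner_memory_gb = 7.0  # GitHub-hosted Windows runner, per the original report
peak_gb = 4.0           # "a little bit over 4 GB"; treated here as a lower bound
trimmers = 4            # parallel illink invocations observed for the Android publish

total_needed_gb = trimmers * peak_gb
fits = total_needed_gb <= runner_memory_gb

print(f"needed: {total_needed_gb} GB, fits in {runner_memory_gb} GB: {fits}")
print("trimmers that fit at once:", max_parallel_trimmers(runner_memory_gb, peak_gb))
```

Under these assumptions only one trimmer fits at a time, which is why serializing the invocations looks like the natural workaround.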

This should be fixed by #3094 once it's merged into 7.0 and shipped. I tried locally with a build from #3094 and it took less than 1 minute and consumed max 4.5 GB of memory.

> Using latest linker from main (8.0) it is much better - I reran one of the illink invocations and it took only 1 minute and used a little bit over 4 GB of memory.

That's still quite a lot, any pointers into what is taking that much?

I haven't looked at the memory consumption details. I'll leave that to @jtschuster (I'll send you the simple repro offline).

When running the repro on my machine, the process only ever got up to about 3 GB, but I was using a different version of the Android SDK.

It looks like Cecil Instructions, the MarkStep._methods queue, and ParameterDefinitions take up the most space in our heap.

| Type | Size (MB) |
| --- | --- |
| Instructions | 300 |
| _methods (Queue<ValueTuple<MethodDefinition, DependencyInfo, MessageOrigin>>) | 110 |
| ParameterDefinition | 100 |
| MethodDefinition | 80 |
| GenericInstanceType | 60 |
| MethodReturnType | 50 |
| ParameterDefinitionCollection | 44 |
| MethodReference | 42 |

> It looks like Cecil Instructions, the MarkStep._methods queue, and ParameterDefinitions take up the most space in our heap.

It could also be useful to check if we could "free" some memory earlier.

Another idea is to process the methods in a different order. If you imagine how typical programs work:

  • You start at Main
  • This calls a couple of other high-level methods
  • Which call another set of slightly lower-level methods
  • And so on, until it reaches the low-level methods

Each level is bigger in size (number of different methods) than the one above it, so it looks like a tree. There will be direct calls to low-level methods spread around there as well, but they're not the ones which hurt. Since we use a simple queue, this will effectively do a breadth-first walk of the tree, where it basically keeps the entire lower level of the tree in the queue before it gets to it (probably more than just one level, actually).

Another way to look at it:

  • High-level methods are "Expensive" because they bring in a lot of dependencies -> they put a lot of things into the queue.
  • Low-level methods are "Cheap" because they don't have dependencies -> they put few (if any) items into the queue.

Our current algorithm effectively prioritizes processing high-level methods over the low-level ones (see the tree description above). This makes the queue long.

I prototyped this real quick by changing the Queue to a Stack (we could probably still do better than that), and for hello world the maximum sizes of the collection were:

  • Current main - 3735
  • With #3139 - 833
  • With Stack and #3139 - 805

I would expect the effect to get bigger for larger apps.
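The queue-versus-stack intuition can be illustrated with a toy simulation. This is only a sketch on an idealized, complete call tree (every method calls exactly `branching` new methods down to a fixed `depth`); it is not the linker's MarkStep, which walks a graph with shared callees and already-marked methods. The real numbers above show a far smaller gap than this idealized model does.

```python
from collections import deque


def peak_worklist_size(branching: int, depth: int, use_stack: bool) -> int:
    """Walk a complete call tree and return the largest worklist size seen."""
    worklist = deque([0])  # each entry is the depth of a pending method; root is 0
    peak = 1
    while worklist:
        # Stack: take the newest entry (depth-first). Queue: take the oldest (breadth-first).
        level = worklist.pop() if use_stack else worklist.popleft()
        if level < depth:
            # "Processing" a method enqueues each of its callees.
            worklist.extend([level + 1] * branching)
        peak = max(peak, len(worklist))
    return peak


queue_peak = peak_worklist_size(branching=3, depth=6, use_stack=False)
stack_peak = peak_worklist_size(branching=3, depth=6, use_stack=True)
print(queue_peak, stack_peak)  # 729 13
```

With a queue, the worklist ends up holding the whole lowest level of the tree (3^6 = 729 entries); with a stack it holds only the pending siblings along one path.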

Another thing to check would be if we keep MarkStep around after it's done. In theory the driver should free steps which are processed, but it's a complex system, so hard to tell.

> I would expect the effect to get bigger for larger apps.

Unfortunately, with #3139, _methods is about the same size (~14 MB) in the repro for this issue whether it is a queue or a stack.

I saw this issue with my app NewsReader when I was using MAUI 7 (.NET 7).

After upgrading to MAUI 8 (.NET 8) I rechecked this issue: the GitHub Actions build for Android in Release mode works again.

Note: It is still much slower than the build for iOS or Windows (WinUI 3).

I can confirm this is no longer an issue in .NET 8. Closing this.

@vitek-karas this is an issue for us in .NET 8. We have locally hosted, dedicated agents with 10 GB of memory. It fails inconsistently (around 50% of the time). Here's a screenshot of the memory usage while it's building:
[Image: screenshot of memory usage during the build]

Here are the logs from our Azure DevOps pipeline run:

  Optimizing assemblies for size may change the behavior of the app. Be sure to test after publishing. See: https://aka.ms/dotnet-illink
  Optimizing assemblies for size. This process might take a while.
##[warning]Free memory is lower than 5%; Currently used: 95.19%
##[warning]Free memory is lower than 5%; Currently used: 95.27%
##[warning]Free memory is lower than 5%; Currently used: 95.22%
##[warning]Free memory is lower than 5%; Currently used: 95.13%
##[warning]Free memory is lower than 5%; Currently used: 95.32%
##[warning]Free memory is lower than 5%; Currently used: 95.05%
ILLink : error IL1012: IL Trimmer has encountered an unexpected error. Please report the issue at https://aka.ms/report-illink [C:\A\w\2200\s\Features\MyEbms\MyEbms\Mobile\MyEbms.Mobile\MyEbms.Mobile.csproj::TargetFramework=net8.0-android]
  Fatal error in IL Linker
  Out of memory.
C:\Users\Build\.nuget\packages\microsoft.net.illink.tasks\8.0.8\build\Microsoft.NET.ILLink.targets(87,5): error NETSDK1144: Optimizing assemblies for size failed. Optimization can be disabled by setting the PublishTrimmed property to false. [C:\A\w\2200\s\Features\MyEbms\MyEbms\Mobile\MyEbms.Mobile\MyEbms.Mobile.csproj::TargetFramework=net8.0-android]
C:\Users\Build\.nuget\packages\microsoft.net.illink.tasks\8.0.8\build\Microsoft.NET.ILLink.targets(134,5): error MSB4018: The "ILLink" task failed unexpectedly. [C:\A\w\2200\s\Features\MyEbms\MyEbms\Mobile\MyEbms.Mobile\MyEbms.Mobile.csproj::TargetFramework=net8.0-android]
C:\Users\Build\.nuget\packages\microsoft.net.illink.tasks\8.0.8\build\Microsoft.NET.ILLink.targets(134,5): error MSB4018: System.OutOfMemoryException: Exception of type 'System.OutOfMemoryException' was thrown. [C:\A\w\2200\s\Features\MyEbms\MyEbms\Mobile\MyEbms.Mobile\MyEbms.Mobile.csproj::TargetFramework=net8.0-android]
C:\Users\Build\.nuget\packages\microsoft.net.illink.tasks\8.0.8\build\Microsoft.NET.ILLink.targets(134,5): error MSB4018:    at Regex11_TryMatchAtCurrentPosition(RegexRunner, ReadOnlySpan`1) [C:\A\w\2200\s\Features\MyEbms\MyEbms\Mobile\MyEbms.Mobile\MyEbms.Mobile.csproj::TargetFramework=net8.0-android]
C:\Users\Build\.nuget\packages\microsoft.net.illink.tasks\8.0.8\build\Microsoft.NET.ILLink.targets(134,5): error MSB4018:    at Regex11_Scan(RegexRunner, ReadOnlySpan`1) [C:\A\w\2200\s\Features\MyEbms\MyEbms\Mobile\MyEbms.Mobile\MyEbms.Mobile.csproj::TargetFramework=net8.0-android]
C:\Users\Build\.nuget\packages\microsoft.net.illink.tasks\8.0.8\build\Microsoft.NET.ILLink.targets(134,5): error MSB4018:    at System.Text.RegularExpressions.Regex.RunSingleMatch(RegexRunnerMode mode, Int32 prevlen, String input, Int32 beginning, Int32 length, Int32 startat) [C:\A\w\2200\s\Features\MyEbms\MyEbms\Mobile\MyEbms.Mobile\MyEbms.Mobile.csproj::TargetFramework=net8.0-android]
C:\Users\Build\.nuget\packages\microsoft.net.illink.tasks\8.0.8\build\Microsoft.NET.ILLink.targets(134,5): error MSB4018:    at System.Text.RegularExpressions.Regex.Match(String input) [C:\A\w\2200\s\Features\MyEbms\MyEbms\Mobile\MyEbms.Mobile\MyEbms.Mobile.csproj::TargetFramework=net8.0-android]
C:\Users\Build\.nuget\packages\microsoft.net.illink.tasks\8.0.8\build\Microsoft.NET.ILLink.targets(134,5): error MSB4018:    at Microsoft.Build.Shared.CanonicalError.Parse(String message) [C:\A\w\2200\s\Features\MyEbms\MyEbms\Mobile\MyEbms.Mobile\MyEbms.Mobile.csproj::TargetFramework=net8.0-android]
C:\Users\Build\.nuget\packages\microsoft.net.illink.tasks\8.0.8\build\Microsoft.NET.ILLink.targets(134,5): error MSB4018:    at Microsoft.Build.Utilities.TaskLoggingHelper.LogMessageFromText(String lineOfText, MessageImportance messageImportance) [C:\A\w\2200\s\Features\MyEbms\MyEbms\Mobile\MyEbms.Mobile\MyEbms.Mobile.csproj::TargetFramework=net8.0-android]
C:\Users\Build\.nuget\packages\microsoft.net.illink.tasks\8.0.8\build\Microsoft.NET.ILLink.targets(134,5): error MSB4018:    at Microsoft.Build.Utilities.ToolTask.LogEventsFromTextOutput(String singleLine, MessageImportance messageImportance) [C:\A\w\2200\s\Features\MyEbms\MyEbms\Mobile\MyEbms.Mobile\MyEbms.Mobile.csproj::TargetFramework=net8.0-android]
C:\Users\Build\.nuget\packages\microsoft.net.illink.tasks\8.0.8\build\Microsoft.NET.ILLink.targets(134,5): error MSB4018:    at Microsoft.Build.Utilities.ToolTask.LogMessagesFromStandardErrorOrOutput(Queue dataQueue, ManualResetEvent dataAvailableSignal, MessageImportance messageImportance, StandardOutputOrErrorQueueType queueType) [C:\A\w\2200\s\Features\MyEbms\MyEbms\Mobile\MyEbms.Mobile\MyEbms.Mobile.csproj::TargetFramework=net8.0-android]
C:\Users\Build\.nuget\packages\microsoft.net.illink.tasks\8.0.8\build\Microsoft.NET.ILLink.targets(134,5): error MSB4018:    at Microsoft.Build.Utilities.ToolTask.HandleToolNotifications(Process proc) [C:\A\w\2200\s\Features\MyEbms\MyEbms\Mobile\MyEbms.Mobile\MyEbms.Mobile.csproj::TargetFramework=net8.0-android]
C:\Users\Build\.nuget\packages\microsoft.net.illink.tasks\8.0.8\build\Microsoft.NET.ILLink.targets(134,5): error MSB4018:    at Microsoft.Build.Utilities.ToolTask.ExecuteTool(String pathToTool, String responseFileCommands, String commandLineCommands) [C:\A\w\2200\s\Features\MyEbms\MyEbms\Mobile\MyEbms.Mobile\MyEbms.Mobile.csproj::TargetFramework=net8.0-android]
C:\Users\Build\.nuget\packages\microsoft.net.illink.tasks\8.0.8\build\Microsoft.NET.ILLink.targets(134,5): error MSB4018:    at Microsoft.Build.Utilities.ToolTask.Execute() [C:\A\w\2200\s\Features\MyEbms\MyEbms\Mobile\MyEbms.Mobile\MyEbms.Mobile.csproj::TargetFramework=net8.0-android]
C:\Users\Build\.nuget\packages\microsoft.net.illink.tasks\8.0.8\build\Microsoft.NET.ILLink.targets(134,5): error MSB4018:    at Microsoft.Build.BackEnd.TaskExecutionHost.Execute() [C:\A\w\2200\s\Features\MyEbms\MyEbms\Mobile\MyEbms.Mobile\MyEbms.Mobile.csproj::TargetFramework=net8.0-android]
C:\Users\Build\.nuget\packages\microsoft.net.illink.tasks\8.0.8\build\Microsoft.NET.ILLink.targets(134,5): error MSB4018:    at Microsoft.Build.BackEnd.TaskBuilder.ExecuteInstantiatedTask(TaskExecutionHost taskExecutionHost, TaskLoggingContext taskLoggingContext, TaskHost taskHost, ItemBucket bucket, TaskExecutionMode howToExecuteTask) [C:\A\w\2200\s\Features\MyEbms\MyEbms\Mobile\MyEbms.Mobile\MyEbms.Mobile.csproj::TargetFramework=net8.0-android]

It's not that uncommon for the trimmer to use ~1GB of RAM if the app is really big (it has to load the entire app and build non-trivial data structures around it). Based on the above screenshot there are 4 trimmers running in parallel at this point - so on a 10GB limit, I could see the system running out of memory.
I think the simplest solution would be to reduce MSBuild parallelism to avoid running multiple trimmers in parallel.
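As a rough sketch of what reducing parallelism could look like in a build script, the snippet below assembles a `dotnet publish` command line with the `-maxcpucount:1` MSBuild switch mentioned earlier in this thread. The project name and the helper function are hypothetical. Earlier in the thread this switch was reported not to help the original .NET 7 case, so treat it as something to try rather than a confirmed fix.

```python
import shlex


def build_publish_command(project: str, framework: str, max_cpu_count: int = 1) -> list[str]:
    """Assemble a dotnet publish invocation that limits MSBuild to max_cpu_count nodes."""
    return [
        "dotnet", "publish", project,
        "-c", "Release",
        "-f", framework,
        f"-maxcpucount:{max_cpu_count}",  # MSBuild switch forwarded by the dotnet CLI
    ]


# "MyApp.csproj" is a placeholder project name, not one from this issue.
command = build_publish_command("MyApp.csproj", "net8.0-android")
print(shlex.join(command))
```

The list form can be passed to `subprocess.run(command, check=True)` from a CI script; it is only printed here so the sketch has no side effects.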