dotnet/diagnostics

[gcdump] objects are sometimes missing from Android dumps

jonathanpeppers opened this issue · 8 comments

Description

I was working on this fix: dotnet/maui#18584

Before I solved it, I put a breakpoint before/after this line:

https://github.com/dotnet/maui/blob/b170fbb7d32c1812919612fda3921cedff4cccc1/src/Controls/tests/DeviceTests/Elements/NavigationPage/NavigationPageTests.cs#L333

In the debugger, I could look at pageReference.Target.Content.Children and see the CarouselView.

Yet it did not show up in the dumps:

The scenario was:

  • Some Android global listener was keeping CarouselView and other objects alive
  • Unit test fails, showing this
  • gcdump file is missing CarouselView and other objects that should be present

Configuration

  • .NET MAUI running on Android
  • .NET 8

Regression?

No, gcdump files are new for Mono in .NET 8.

Other information

I added a trace provider of Microsoft-Windows-DotNETRuntime:1980001:5 for GCHeapSnapshot and saved the .nettrace file as well:

gcdump-missing-android-2.zip
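For anyone wondering where the 1980001 in that provider string comes from, here is a rough sketch of how the hex keyword mask can be composed. The individual keyword names and bit values are an assumption based on the runtime's GC event keywords as I understand them, so check them against the runtime event documentation before relying on them; the function names are made up for illustration.

```shell
#!/bin/sh
# Keyword bits assumed to make up the GCHeapSnapshot mask.
# Names and values are assumptions, not taken from this issue.
GC=0x1
TYPE=0x80000
GC_HEAP_DUMP=0x100000
GC_HEAP_COLLECT=0x800000
GC_HEAP_AND_TYPE_NAMES=0x1000000

# Print the combined mask as upper-case hex without the 0x prefix.
snapshot_mask() {
    printf '%X\n' $(( $GC | $TYPE | $GC_HEAP_DUMP | $GC_HEAP_COLLECT | $GC_HEAP_AND_TYPE_NAMES ))
}

# Build the Provider:Keywords:Level string (level 5 = verbose).
trace_provider() {
    printf 'Microsoft-Windows-DotNETRuntime:%s:5\n' "$(snapshot_mask)"
}
```

The resulting string can be passed to something like `dotnet-trace collect -p <pid> --providers "$(trace_provider)"`, where `<pid>` is whatever process you are collecting through.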

With some folks moving around .NET teams, I filed this mainly so we don't lose it. We need to try to get a smaller repro to solve this, so you wouldn't have to build MAUI to repro it.

I think @mdh1418 might be someone that works on this eventually?

/cc @PureWeen if we can reproduce in a smaller app.

@mdh1418 are you more familiar with this side of things? I'm not sure how mono events work wrt the GC

I haven't worked with gcdump before. I'd have to try it out. @jonathanpeppers you mentioned it's new for Mono in .NET 8, do you know if gcdump has worked for Mono in other contexts?

.NET 8 was the first release where dotnet-gcdump works for Mono at all, but it works on Android, iOS, MacCatalyst, etc.

Ok, we have a repro: TableViewLeak.zip

I did this on an Android device (but emulator would also work), here are the steps:

  • Add <AndroidEnableProfiler>true</AndroidEnableProfiler> to the .csproj to enable the Mono diagnostics component
  • Run dotnet-dsrouter android
  • adb reverse tcp:9000 tcp:9001 (forwards port for device)
  • adb shell setprop debug.mono.profile '127.0.0.1:9000,nosuspend,connect' equivalent of $DOTNET_DiagnosticPorts
  • Run the app
  • dotnet-gcdump ps - find PID of dsrouter
  • dotnet-gcdump collect -p 1234 where 1234 is the PID
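The command-line part of the steps above could be wrapped in a small script along these lines. This is only a sketch: the `run`, `setup_device`, and `collect_dump` helpers and the `DRY_RUN` variable are made up for illustration, and it assumes `dotnet-dsrouter android` is already running in another terminal. By default it only prints the commands.

```shell
#!/bin/sh
# Dry-run wrapper: prints each command unless DRY_RUN=0 is set.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# Device-side setup: port forwarding plus the diagnostics port property.
setup_device() {
    run adb reverse tcp:9000 tcp:9001
    run adb shell setprop debug.mono.profile '127.0.0.1:9000,nosuspend,connect'
}

# Collect a dump; $1 is the dsrouter PID shown by `dotnet-gcdump ps`.
collect_dump() {
    run dotnet-gcdump collect -p "$1"
}
```

After sourcing the file, `setup_device` followed by `collect_dump 1234` prints the commands from the list; setting `DRY_RUN=0` first runs them for real. Launching the app and finding the PID with `dotnet-gcdump ps` stay manual.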

While in the app, I pressed:

  • Push Page
  • Push Page
  • Check Result

Taking a couple gcdumps along the way.

The app itself holds a WeakReference to LeakPage and then verifies it's gone during Check Result. The app shows the object is alive, but I can't find it in any dumps:

missing-leak-page.zip

Sorry for the @-mention, @jonathanpeppers.
Would you please share the TargetPlatform configuration of your csproj?
We went through your dev blog and the Xamarin Android profiling guide trying to collect a gcdump of an embedded (Android) Mono application, with no luck. The gcdump produced appears to be of dotnet-dsrouter itself rather than of the app on the Android device.

We added -p:AndroidEnableProfiler=true to the build command. We build with the .NET 8.0.1 Windows SDK and later package with the .NET 8.0.1 Android runtime.
When the wrapper application (Unreal Engine, specifically) launches, we start the embedded Mono VM as an engine plugin with the environment variables "MONO_DIAGNOSTICS=--diagnostic-mono-profiler=enable" and "DOTNET_DiagnosticPorts=some_ip:some_port". It seems the only missing part is the "-f net8.0-android" option when building the Mono C# project.

dotnet-dsrouter and dotnet-trace work with this setup: we can get tracing data, and we can get a gcdump following the Xamarin guide with the help of mono-gcdump (the .NET 7 way). dotnet-gcdump, however, only produces a dump of dotnet-dsrouter rather than of the embedded Mono VM. (P.S. We had to change 127.0.0.1 to the USB-connected PC's IP address, which also differs from the guides.)

Is it because we are missing some configuration, or does this setup not work for embedded Mono yet?

@AlexeiNaabal I would try a dotnet new android project template to see if you can get that to work.

Once that's working you might see if these docs help your Mono embedding scenario:

Offhand, I wonder if you have libmono-component-diagnostics_tracing.so, as that's required for it to work at all.
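One quick way to check for that library is to list the contents of the packaged app and look for the file name. A minimal sketch, assuming the app is packaged as an APK and `unzip` is available; the `has_diagnostics_component` helper and the APK path are hypothetical:

```shell
#!/bin/sh
# Reads an archive listing on stdin; succeeds if the Mono
# diagnostics tracing component appears anywhere in it.
has_diagnostics_component() {
    grep -q 'libmono-component-diagnostics_tracing\.so'
}
```

Usage would be something like `unzip -l path/to/app.apk | has_diagnostics_component && echo found`. If nothing is printed, the component was probably not packaged.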

@mdh1418 found an important box! Show dead objects

We are able to track down the leak, and find the missing objects. I think we can close this, unless someone thinks something here is still "wrong". Thanks!