DataDog/dd-sdk-flutter

[Bug] MissingPluginException upon process restoration (Android only)

btrautmann opened this issue · 7 comments

Describe what happened

We are running into the following issue with our Android app: Upon upgrading to version 2.1.0, we began noticing MissingPluginExceptions being thrown whenever the app process is restored after being terminated by the system. We noticed in that release that a couple of class instances were migrated to being singletons. We suspect that this may be playing a role in the issue, but have not yet dug deep enough to verify that this is the case.

When the issue occurs, logs look something like:

[E]  mobile_unhandled_error: GlobalErrorHandler.handleError called  ERROR: MissingPluginException(No implementation found for method removeAttribute on channel datadog_sdk_flutter.logs)
2024-01-19 17:05:29.603  9516-9646  flutter                 com.betterment                       I  #0      MethodChannel._invokeMethod (package:flutter/src/services/platform_channel.dart:308:7)
2024-01-19 17:05:29.603  9516-9646  flutter                 com.betterment                       I  <asynchronous suspension>
2024-01-19 17:05:29.604  9516-9646  flutter                 com.betterment                       I  [D]  SentryService.recordError  ERROR: MissingPluginException(No implementation found for method removeAttribute on channel datadog_sdk_flutter.logs)
2024-01-19 17:05:29.618  9516-9646  flutter                 com.betterment                       I  #0      MethodChannel._invokeMethod (package:flutter/src/services/platform_channel.dart:308:7)
2024-01-19 17:05:29.618  9516-9646  flutter                 com.betterment                       I  <asynchronous suspension>
2024-01-19 17:05:29.659  9516-9646  flutter                 com.betterment                       I  [E]  mobile_unhandled_error: GlobalErrorHandler.handleError called  ERROR: MissingPluginException(No implementation found for method addAttribute on channel datadog_sdk_flutter.logs)
2024-01-19 17:05:29.660  9516-9646  flutter                 com.betterment                       I  #0      MethodChannel._invokeMethod (package:flutter/src/services/platform_channel.dart:308:7)
2024-01-19 17:05:29.660  9516-9646  flutter                 com.betterment                       I  <asynchronous suspension>
2024-01-19 17:05:29.660  9516-9646  flutter                 com.betterment                       I  [D]  SentryService.recordError  ERROR: MissingPluginException(No implementation found for method addAttribute on channel datadog_sdk_flutter.logs)
2024-01-19 17:05:29.661  9516-9646  flutter                 com.betterment                       I  #0      MethodChannel._invokeMethod (package:flutter/src/services/platform_channel.dart:308:7)
2024-01-19 17:05:29.661  9516-9646  flutter                 com.betterment                       I  <asynchronous suspension>

Basically, any method sent over the MethodChannel will throw this, including log itself. It's notable that all of our setup logic (where we configure the DatadogSDK) is broken due to this.

It also may be worth noting that our app is a hybrid app, meaning there are a few features that remain in native.

Steps to reproduce the issue:

  1. Enable developer options
  2. Install/side-load the application on an Android device
  3. Set background process limit to No background processes in Developer Options
  4. Run adb logcat | grep flutter to see device logs (we'll want it via adb because backgrounding the app will force the system to kill it and you'll lose logs provided via the Flutter tooling)
  5. Background the app and wait for the system to kill it (Flutter tooling will print Lost connection to device)
  6. Foreground the application and invoke Datadog methods
  7. MissingPluginExceptions occur

Describe what you expected:

The DatadogSDK should be usable after process restoration.

Additional context

  • Dart/Flutter version:
Flutter 3.10.7 • channel unknown • unknown source
Framework • revision e285328a69 (5 months ago) • 2023-08-17 17:55:17 -0700
Engine • revision 077a732ef4
Tools • Dart 3.0.7 • DevTools 2.23.1
  • Android/iOS OS version: Android 14
  • Device Model: Pixel 7/Pro
  • Datadog SDK version: 2.1.x+
  • Versions of any other relevant dependencies:

We will continue to look into why this might be happening, but wanted to open an issue first in case the root cause was obvious to you.

Hi @btrautmann, thanks for reporting!

I'll start taking a look into this ASAP.

You said you're using Flutter in a hybrid app - are you initializing the native Datadog SDK in Android then using attachToExisting, or are you only initializing Datadog from Flutter?

I'm going to try as well, but can you reproduce the issue in our hybrid example?

Lastly, can you send me your code that initializes Datadog (either in Android or Flutter) and how you're creating your FlutterActivity or Fragment?

> are you initializing the native Datadog SDK in Android then using attachToExisting, or are you only initializing Datadog from Flutter?

We are only initializing it in Flutter. The native side sends logs TO Flutter via a MethodChannel (totally unrelated to Datadog) and Flutter pipes them over to Datadog. We hold these logs in a queue until Flutter signals to native that it's set up and ready to receive them. We do it this way so that once we are done with our native migration we can just sever the connection between Flutter & Native and not need to do any other refactoring of Flutter code.
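The buffering described above (hold native logs until Flutter signals it is ready, then flush them) can be sketched roughly as follows. This is a minimal illustration, not the reporter's actual code: `PendingLogQueue`, `log`, `onFlutterReady`, and the `send` callback are hypothetical names, and `send` stands in for whatever MethodChannel call forwards a log to Flutter.

```kotlin
// Hypothetical sketch: buffers native log lines until the Flutter side signals readiness.
// `send` is a stand-in for the real call that forwards a message to Flutter.
class PendingLogQueue(private val send: (String) -> Unit) {
    private val pending = ArrayDeque<String>()
    private var ready = false

    /** Forwards the message immediately if Flutter is ready; otherwise queues it. */
    @Synchronized
    fun log(message: String) {
        if (ready) send(message) else pending.addLast(message)
    }

    /** Called once Flutter signals it can receive logs; flushes the queue in FIFO order. */
    @Synchronized
    fun onFlutterReady() {
        ready = true
        while (pending.isNotEmpty()) send(pending.removeFirst())
    }
}

fun main() {
    val delivered = mutableListOf<String>()
    val queue = PendingLogQueue { delivered.add(it) }

    queue.log("native log before Flutter is ready") // held in the queue
    println(delivered.size)

    queue.onFlutterReady()                          // flushes the held log
    queue.log("native log after Flutter is ready")  // forwarded directly
    println(delivered)
}
```

Because nothing is forwarded until `onFlutterReady` runs, a setup like this depends on the Flutter side (and the plugins it calls) being fully registered by the time that signal is sent.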

> can you reproduce the issue in our hybrid example

I will give this a shot today! Thanks for pointing me to it.

> Lastly, can you send me your code that initializes Datadog (either in Android or Flutter) and how you're creating your FlutterActivity or Fragment?

Initializing Datadog

Note: I've redacted a few more "sensitive" things.

final datadogConfiguration = DdSdkConfiguration(
  firstPartyHosts: [<redacted>],
  clientToken: <redacted>,
  serviceName: '<redacted>',
  env: config.environment.name,
  trackingConsent: TrackingConsent.granted,
  site: <redacted>,
  rumConfiguration: _enableRealUserMonitoring
      ? RumConfiguration(
          applicationId: _applicationId,
          reportFlutterPerformance: true,
          sessionSamplingRate: 100.0,
          tracingSamplingRate: 100.0,
        )
      : null,
);

if (_enableRealUserMonitoring) datadogConfiguration.enableHttpTracking();

await _datadogSdk.initialize(datadogConfiguration);

if (_enableRealUserMonitoring) {
  _datadogSdk.rum
    ?..addAttribute('device_id', deviceId)
    ..addAttribute('app_session_id', sessionId)
    ..addAttribute('app_version', config.version.toString())
    ..addAttribute('build_number', config.buildNumber)
    ..addAttribute('audience', config.audience.name)
    ..addAttribute('platform', platform.name);
}

_logger = _datadogSdk.createLogger(
  LoggingConfiguration(
    loggerName: <redacted>,
    sendLogsToDatadog: config.enableRemoteLogging,
    datadogReportingThreshold: Log.logLevel.toDatadogVerbosity(),
    sendNetworkInfo: true,
  ),
)
  ..addAttribute('device_id', deviceId)
  ..addAttribute('app_session_id', sessionId)
  ..addAttribute('app_version', config.version.toString())
  ..addAttribute('build_number', config.buildNumber)
  ..addAttribute('audience', config.audience.name)
  ..addAttribute(
    'platform',
    platform == SupportedPlatform.ios ? 'ios' : 'android',
  );

This code is invoked in our main.dart file.

FlutterActivity

Our Flutter Activity extends FlutterFragmentActivity and is the entrypoint activity for our application. The only mildly interesting thing about our implementation is that we override configureFlutterEngine as follows:

override fun configureFlutterEngine(flutterEngine: FlutterEngine) {
  super.configureFlutterEngine(flutterEngine)
  FlutterManager.init(this, flutterEngine)
}

The FlutterManager is not interesting; it just invokes a start method on a bunch of services we have that maintain MethodChannels. One of these is the logging service I mentioned earlier, and once its start method is called, it flushes the queue of logs native wants to send to Flutter to be piped to Datadog.

Reporting back on the hybrid example/sample app:

Some notes: We had to bump the compileSdkVersion and minSdk of the android module at flutter_module/.android/app/build.gradle. After doing so, we had duplicate class issues that were resolved by adding a dependency on the Kotlin BOM (implementation(platform("org.jetbrains.kotlin:kotlin-bom:1.8.0"))) in the same file.

After doing the above and getting a successful build, we saw this log in the console which feels like it indicates that things may not be behaving as expected? Note that I added my own logs to the relevant MethodChannels to see what methods were being triggered.

I/System.out( 9182): BRANDON: Received methodCall setSdkVerbosity while Datadog#isInitialized false
I/System.out( 9182): BRANDON: Received methodCall attachToExisting while Datadog#isInitialized false
E/DatadogFlutter( 9182): 🔥 attachToExisting was called, but no existing instance of the Datadog SDK exists. Make sure to initialize the Native Datadog SDK before calling attachToExisting.
I/flutter ( 9182): [Datadog 🐶🔥 ] Failed to attach to an existing native instance of the Datadog SDK.

Thanks for checking. The example is meant to be built from the android directory in Android Studio. I should probably remove the .android directory from the flutter_module.

I've tested with our hybrid example and wasn't able to reproduce, but in that example we initialize Datadog in the native layer and then use attachToExisting. It also appears not to retain previous state. I will reverse the initialization and see if I can reproduce.

The only way I could see this happening at first glance is that for some reason Flutter is not calling onAttachedToEngine for our plugin, which is what sets up the method channels. But I don't know why that would work on initial app launch and not after a background termination.

I'll try to keep looking, but if you can supply a more minimal reproduction that would be a huge help.

Hi @btrautmann,

I've modified our hybrid example in this branch to be more similar to the setup you describe, but I still can't reproduce the issue. As with the other example, open the android directory in Android Studio and run from there.

It appears that onAttachedToEngine gets called as part of the super.configureFlutterEngine call, which also calls 'main'. Weirdly, I couldn't get method channel calls to work from configureFlutterEngine, but by putting the Datadog initialization in main I was still able to get everything connected properly, even after the system terminated the application.

Is there anything about the configuration of the FragmentActivity that maybe could be contributing?

@fuzzybinary thanks for looking into this for us, we really do appreciate it.

After toying with the sample app, which, as you mentioned, did not reproduce the issue, we dug further and realized that we had some legacy code on the native side that would check whether the app had been restored (new process id) and forcibly start a new FlutterFragmentActivity over the Activity that the system was in the process of restoring. We believe this process (creating a new Activity which, due to its Intent flags, would force the one being restored to be finished) was leading to lifecycle overlap and potentially borking plugin registration. Likely (though I haven't really confirmed this) Datadog is the first plugin we interact with on startup and so was just a red herring. It is interesting, though, that it was only seen when upgrading Datadog to the version that used a singleton. That may actually be a contributing factor, but given our inability to reproduce on your hybrid project, my gut is telling me the funky thing we're doing on our end is the actual root cause.
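One way a "restored in a new process" check like the one mentioned above can be written is to save the process id alongside the instance state and compare it on restore. The sketch below only illustrates that general technique under stated assumptions. It is not the legacy code or the fix, neither of which is shown in this thread. `KEY_PID`, `savePid`, and `isRestoredInNewProcess` are hypothetical names, and a plain map stands in for Android's saved-instance-state `Bundle`.

```kotlin
// Hypothetical sketch: detect that saved state was written by a different process.
// A plain Map stands in for Android's saved-instance-state Bundle.
const val KEY_PID = "saved_pid" // hypothetical key name

/** Records the current process id so a later restore can compare against it. */
fun savePid(state: MutableMap<String, Int>, currentPid: Int) {
    state[KEY_PID] = currentPid
}

/** True only when saved state exists and was written by a different process. */
fun isRestoredInNewProcess(state: Map<String, Int>?, currentPid: Int): Boolean {
    val savedPid = state?.get(KEY_PID) ?: return false
    return savedPid != currentPid
}

fun main() {
    val state = mutableMapOf<String, Int>()
    savePid(state, currentPid = 9516)

    println(isRestoredInNewProcess(null, 9516))   // fresh launch: no saved state
    println(isRestoredInNewProcess(state, 9516))  // same process, e.g. a configuration change
    println(isRestoredInNewProcess(state, 10234)) // process was killed and later restored
}
```

The check itself has no side effects. Per the comment above, the suspected problem was what the legacy code did in response to it: starting a second Activity while the system was still restoring the first.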

We have a fix on our end and if it doesn't end up resolving the issue we'll dig deeper and re-open if we believe it to be a Datadog issue. Thanks again for your help and I hope we weren't too much of an inconvenience!

Sorry I couldn't help more and hopefully you can get to the bottom of it!

Sorry the singleton change broke things. There are some oddities in the way Flutter handles plugins that necessitated the change, but I'm actively thinking about how I can restructure so that the singleton nature of the native SDKs doesn't get in the way of the non-singleton nature of plugins (especially in hybrid scenarios). Not sure when I'll be able to perform that refactor, though.