DataDog/dd-sdk-flutter

Plugin does not log app's startup crashes.

Den-creator opened this issue · 17 comments

Describe the bug

If the application crashes upon startup, the Datadog Flutter SDK fails to upload a crash report to Datadog upon application restart. The crash report only appears after the next successful restart and usage of the app. This implies that if a crash occurs during app startup, we will never be aware that some users have experienced a crash. However, the documentation states:

If your application suffers a fatal crash, after your application restarts, the Datadog Flutter SDK uploads a crash report to Datadog. For non-fatal errors, the Datadog Flutter SDK uploads these errors with other RUM data.

Reproduction steps

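import 'package:datadog_flutter_plugin/datadog_flutter_plugin.dart';
import 'package:firebase_core/firebase_core.dart';
import 'package:firebase_crashlytics/firebase_crashlytics.dart';
import 'package:flutter/material.dart';

// Assumed location of the FlutterFire-generated options file.
import 'firebase_options.dart';

Future<void> main() async {
  await DatadogSdk.runApp(
    DatadogConfiguration(
      clientToken: 'clientToken',
      env: 'DEV',
      site: DatadogSite.us1,
      nativeCrashReportEnabled: true,
      rumConfiguration: DatadogRumConfiguration(
        applicationId: 'applicationId',
        reportFlutterPerformance: true,
      ),
      loggingConfiguration: DatadogLoggingConfiguration(),
    ),
    TrackingConsent.granted,
    () async {
      await Firebase.initializeApp(
        options: DefaultFirebaseOptions.currentPlatform,
      );
      // Force a native crash on every launch, before the UI starts.
      FirebaseCrashlytics.instance.crash();
      runApp(App());
    },
  );
}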

SDK logs

No response

Expected behavior

After the first crash on app startup, if the user opens the app again, the SDK should send the crash report to Datadog before a second crash can happen.

Affected SDK versions

2.3.0

Latest working SDK version

No response

Did you confirm if the latest SDK version fixes the bug?

No

Flutter Version

3.16.9

Setup Type

Flutter Application

Device Information

iOS 17.3.1, iPhone 12, Wifi, battery

Other relevant information

No response

Hey @Den-creator,

Yes, this is accurate. There are actually two potential problems.

First, we initialize crash tracking as part of SDK initialization. If your app crashes before then, we unfortunately won't catch it and can't send it. We are actively looking for ways to improve this that both allow configurability and comply with a user's tracking consent, but that's the way it is for now.

Second, crashes are added to the next uploadable batch on app restart, but aren't sent immediately. If we never have time to send a batch, we won't be able to report the crash. I'll discuss with the team potential solutions for this, but the only "guaranteed" solution I can think of would be to prevent initialization from finishing until we've sent, or at least attempted to send, the crash report, which has the tradeoff of making that method significantly slower.
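To make the tradeoff concrete, here is a minimal sketch of the "stall" idea with an upper bound on the delay. It is not SDK code: `flushCrashBeforeStart`, `PendingCrashUploader`, and the two-second default are hypothetical names and values used only for illustration, and the uploader callback stands in for whatever would actually send the pending crash batch.

```dart
import 'dart:async';

/// Hypothetical callback that uploads any crash report left over from the
/// previous launch. It stands in for the real upload; it is not an SDK API.
typedef PendingCrashUploader = Future<void> Function();

/// Waits for [uploadPendingCrash] to finish before startup continues, but
/// never longer than [maxStall].
///
/// Returns true if the upload completed in time, false if it timed out or
/// failed. Either way the caller can go on to start the app.
Future<bool> flushCrashBeforeStart(
  PendingCrashUploader uploadPendingCrash, {
  Duration maxStall = const Duration(seconds: 2),
}) async {
  try {
    await uploadPendingCrash().timeout(maxStall);
    return true;
  } on TimeoutException {
    // The upload is taking too long; give up waiting so startup is not blocked.
    return false;
  } catch (_) {
    // The upload attempt failed (for example, no network); do not block startup.
    return false;
  }
}
```

With a bound like this, initialization slows down by at most `maxStall`, at the cost of still missing the report when the upload takes longer than that.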

@fuzzybinary, thank you for the reply. I agree with your statement:

The only "guaranteed" solution I can think of would be to prevent initialization from finishing until we've sent, or at least attempted to send, the crash report, which has the tradeoff of making that method significantly slower.

I was expecting such behavior from the SDK, as the current crash reporting mechanism is useless in cases when the app crashes on startup or almost immediately after. The tradeoff may not be significant compared to the issue we are currently facing. You could implement this behavior as optional, allowing developers to disable it if they prefer not to slow down the initialization process.

Could you please confirm whether you will be able to implement the above? If so, could you please provide an estimate of when it will be ready?

Re-tagging as an enhancement rather than a bug. Can you reach out to your CSM to raise a feature request? We've started talking internally about ways to implement this, but unfortunately I can't provide a timeline.

Can you reach out to your CSM to raise a feature request?

@fuzzybinary I will forward this to our team. Thanks!

@fuzzybinary I suggest taking a look at how competing products work, like Sentry.

@fuzzybinary how can we raise a feature request? Should it be done via the support button on the Datadog website?

Hi @Den-creator,

Yes, if you don't have a CSM, submit it through Datadog support.

To keep everyone informed, we are looking at a partial fix coming soon. While we won't delay initialization, we will start sending data faster, which should capture more crashes.

Sorry, but where can I find a CSM?

CSM is your Customer Success Manager. Not every client has one, but they would be your primary contact with Datadog if you do.

If you don't have one, you can use the Support chat and the feature request will be routed to the correct place.

Thanks for the reply!

Hi folks,

datadog_flutter_plugin 2.5.0 has a change in the iOS and Android SDKs that will start sending data immediately on initialization. While this doesn't ensure that all crashes at startup will be caught and sent, it should improve the situation dramatically.

I'm going to keep this issue open so we can track potentially adding a "stall" during initialization to ensure crashes are sent, but that requires a bit more discussion on our side.

Maybe we can have something more flexible.

The SDK might initialise and try to upload the last batch in a separate thread. If the app crashes, it may not finish the upload successfully.

So you might add a counter, increasing it on every upload attempt and resetting it to zero on success. If the counter hits 3 attempts, you stall the app's initialisation and ensure the batch is uploaded.

My reasoning is that if the app is crashing immediately after it starts, the user won't care if the app stalls for a little bit at start-up and then crashes again. The app is bad anyway and the user can't use it, so the SDK stalling it won't have any negative impact. In a good app, meanwhile, the upload should complete in a separate thread with no issues, without blocking the app's initialisation at all.

Thoughts?
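A rough sketch of the counter idea above, under stated assumptions: `AttemptStore`, `InMemoryAttemptStore`, `uploadWithAdaptiveStall`, and `stallThreshold` are hypothetical names, not part of the Datadog SDK, and the in-memory store is only a stand-in. A real counter would have to be persisted to disk so that it survives the crash.

```dart
import 'dart:async';

/// Hypothetical counter storage. A real one must persist across launches.
abstract class AttemptStore {
  int read();
  void write(int value);
}

/// In-memory stand-in, for illustration only.
class InMemoryAttemptStore implements AttemptStore {
  int _value = 0;

  @override
  int read() => _value;

  @override
  void write(int value) => _value = value;
}

/// Number of attempts at which startup begins to wait for the upload.
const stallThreshold = 3;

/// Runs [upload] in the background on the first attempts and only awaits it
/// once the attempt counter reaches [stallThreshold].
///
/// Returns true if this call stalled (awaited the upload), false otherwise.
Future<bool> uploadWithAdaptiveStall(
  AttemptStore store,
  Future<void> Function() upload,
) async {
  // Record the attempt first, so a crash during the upload still counts.
  final attempts = store.read() + 1;
  store.write(attempts);

  Future<void> tracked() async {
    await upload();
    store.write(0); // Success: reset the counter.
  }

  if (attempts >= stallThreshold) {
    try {
      await tracked();
    } catch (_) {
      // Upload failed; keep the counter so the next launch stalls again.
    }
    return true;
  }

  // Healthy path: do not block startup, and swallow background failures.
  unawaited(tracked().catchError((Object _) {}));
  return false;
}
```

The counter is incremented before the upload starts, so a launch that crashes mid-upload still leaves a trace for the next launch to act on.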