tszheichoi/awesome-sensor-logger

App crashes after recording about 50,000 data points

Closed this issue · 9 comments

We are currently testing the sensor logger app to record accelerometer measurements with an Android smartphone and send them to a REST endpoint. While it works great in general (thanks a lot!), we discovered that the app crashes after recording about 50,000 data points. This seems to be quite reproducible on three different Android devices.

For further diagnosis, adb logcat gives the following output at the time of the crash:

...
10-19 15:33:28.681  3095  3095 F DEBUG   : Cmdline: com.kelvin.sensorapp
10-19 15:33:28.681  3095  3095 F DEBUG   : pid: 15981, tid: 32467, name: impulse.com/...  >>> com.kelvin.sensorapp <<<
10-19 15:33:28.681  3095  3095 F DEBUG   : uid: 10000
10-19 15:33:28.681  3095  3095 F DEBUG   : signal 6 (SIGABRT), code -1 (SI_QUEUE), fault addr --------
10-19 15:33:28.681  3095  3095 F DEBUG   : Abort message: 'JNI ERROR (app bug): global reference table overflow (max=51200)global reference table dump:
10-19 15:33:28.681  3095  3095 F DEBUG   :   Last 10 entries (of 51199):
10-19 15:33:28.681  3095  3095 F DEBUG   :     51198: 0x157266a0 com.facebook.react.modules.blob.BlobModule
10-19 15:33:28.681  3095  3095 F DEBUG   :     51197: 0x157266a0 com.facebook.react.modules.blob.BlobModule
10-19 15:33:28.681  3095  3095 F DEBUG   :     51196: 0x157266a0 com.facebook.react.modules.blob.BlobModule
10-19 15:33:28.681  3095  3095 F DEBUG   :     51195: 0x157266a0 com.facebook.react.modules.blob.BlobModule
...

The log reports a reference table overflow. The maximum of 51,200 mentioned there seems to correspond quite well with the observed 50,000 data points.
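
To check which class dominates the table, the dump entries in a saved logcat file can be tallied. The following is a minimal sketch, assuming the entry layout shown in the excerpt above (index, address, class name); `tallyGlobalRefs` is a hypothetical helper name, not part of adb or the app.

```javascript
// Count how often each class appears in a "global reference table dump"
// section of adb logcat output. Assumes entries look like:
//   "51198: 0x157266a0 com.facebook.react.modules.blob.BlobModule"
function tallyGlobalRefs(logText) {
  // Index, colon, hex address, then the class name at the end of the line.
  const entry = /\b(\d+):\s+0x[0-9a-fA-F]+\s+([\w.$]+)\s*$/;
  const counts = {};
  for (const line of logText.split("\n")) {
    const match = entry.exec(line);
    if (match) {
      const className = match[2];
      counts[className] = (counts[className] || 0) + 1;
    }
  }
  return counts;
}
```

A table filled almost entirely by one class, as here with `BlobModule`, suggests that a single code path is leaking references.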

A search on StackOverflow indicates that it is probably a problem of unreleased global references:
https://stackoverflow.com/a/42599885

We cannot diagnose this further because the app source code is not public. Hopefully, this input helps in diagnosing and solving the issue. If you need any further input or log data, we are happy to help!

Thanks for bringing this issue to my attention. It is not something I was aware of. I tested on my own Android device (Samsung Galaxy S21) using Sensor Logger 1.22.0 (see the version on the About page of the app) and was not able to reproduce this issue.

Can you please describe step by step how you reproduced this on multiple devices? In particular, include the batch period used for HTTP push, the enabled sensors and sampling rate, how long it typically takes for the app to crash, the Sensor Logger app version, and the device model you tested on.

From a poke around in the code, I'm also wondering whether the HTTP internals in react-native do different things depending on the Content-Type of the response from the server. The fact that the adb logs show "com.facebook.react.modules.blob.BlobModule" indicates that, for whatever reason, BlobModules are getting leaked, and there might be a way to sidestep blob processing entirely if the HTTP response from the server is of a different format. So can you also clarify the Content-Type of the response from the HTTP server you have used? If you are able, please also test using the HTTP server Python code in https://github.com/tszheichoi/awesome-sensor-logger#live-data-streaming and see if the issue still reproduces.

Thank you for your quick response!

Execution environment:

  • Device: Google Pixel 4
  • Android version: 13
  • App version: 1.22.0 Build 3145844
  • Device has been reset completely, only Sensor Logger app is installed and running
  • (the other devices are off-site, so I do not have direct access, but the users report the same problem)

App setup:

  • Logger
    • Activate logger "Accelerometer"
  • Settings
    • HTTP Data Push
      • Enable HTTP Push
      • Insert push URL
      • Batch period to 1s
      • Insert auth header
    • Sampling Frequencies
      • For accelerometer etc. "Sample Once Every 10 Seconds" (others unchanged)
  • Start Recording
  • (data transfer works as expected and the REST endpoint receives data and inserts it into database)

App crash:

  • On the two off-site devices, we have been testing the data transfer for about three weeks now
  • The users reported the problem to occur regularly after a few days
  • The error log above was the first time we examined adb logcat on our test device here on-site
  • At the time the crash occurred, the REST endpoint had received
    • data from 2023-10-17 08:24 until 2023-10-19 15:32
    • about 50,000 x, y, and z accelerometer values

HTTP protocol:

  • Our REST endpoint returns HTTP status code 204 No Content on success
  • In general, that should be the appropriate response code from the 200-299 range if everything succeeded and the server has no further content to return
  • However, as you pointed out, it might be that the client library explicitly expects status code 200 OK instead

Next steps we will try:

  • Return status code 200 instead of 204 and start another test run
  • If the problem persists, start a test run with the proposed Python server
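
The two response variants discussed above (204 without a body versus 200 with a body) can be sketched as a small Node.js handler. This is only an illustration: `makePushHandler`, `startServer`, the JSON body, and the port in the usage comment are assumptions, not the actual REST endpoint used in these tests.

```javascript
const http = require("http");

// Hypothetical helper: build a request handler that acknowledges every push
// with the given success status (204 = no body, otherwise a small JSON body).
function makePushHandler(successStatus) {
  return (req, res) => {
    // Drain the request body; a real endpoint would parse and store it.
    req.on("data", () => {});
    req.on("end", () => {
      if (successStatus === 204) {
        res.writeHead(204);
        res.end();
      } else {
        res.writeHead(successStatus, { "Content-Type": "application/json" });
        res.end(JSON.stringify({ status: "ok" }));
      }
    });
  };
}

// Hypothetical helper: start a server that uses the handler above.
function startServer(port, successStatus) {
  return http.createServer(makePushHandler(successStatus)).listen(port);
}

// Usage (port 8000 is an arbitrary choice):
// startServer(8000, 200);
```

Switching between `startServer(8000, 204)` and `startServer(8000, 200)` would allow comparing the app's behavior for both response types without changing anything else.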

Hopefully, this feedback also helps you to reproduce the problem. Please let us know if you need any further input.

Thanks for the additional details. I have been able to spot a potential issue, which will be fixed in the next release 1.23.0.
The adb logcat you provided was very helpful in understanding this, so thanks in particular for that.

@arneschuldt FYI Version 1.23.0 has just been released on Android. Please test again and see if the issues still exist.

Thank you! We have started a new test run and will report on the results!

An update about the current state of our test run:

In our current test run with app version 1.23.0, we have already received about 60,000 data points (and are thus above the previous "threshold" of 50,000).

We will continue the test run for another few days and provide an updated report on the results. So far, it appears that the app update has solved the problem.

Hence, we will skip the following steps for the time being:

  • Return status code 200 instead of 204 and start another test run
  • If the problem persists, start a test run with the proposed Python server

@tszheichoi Judging from our "long-term" tests over the past few days, it seems that the problem has been resolved with your app update 1.23.0. Thank you very much for the ultra-fast fix!

I therefore propose to close this issue.

Nevertheless, we will extend the test run for another few days on the three test devices. If, contrary to our expectations, the problem arises again, we will re-open here with further details.

@tszheichoi curious what the root cause of the issue was, facing a similar issue in my app

When I looked, it was to do with the response body not getting cleaned up. For me, I sorted out the leak as follows. I hope this helps!

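// Keep a handle on the response so it can be cleaned up in finally().
let originalResp = null;
fetch(...).then(response => {
  originalResp = response;
  ...
}).finally(() => {
  // _bodyBlob is a private field of the fetch polyfill, so guard every
  // access before calling close() on it.
  if (
    originalResp != null &&
    typeof originalResp._bodyBlob === "object" &&
    typeof originalResp._bodyBlob.close === "function"
  ) {
    // Release the blob explicitly, then drop the reference.
    originalResp._bodyBlob.close();
    originalResp._bodyBlob = null;
  }
});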
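
The same cleanup can be factored into a reusable helper. This is a sketch under the same assumptions as the snippet above: it relies on the private `_bodyBlob` field, which may change between React Native versions. The names `closeBodyBlob` and `pushBatch` and the injectable `fetchImpl` parameter are hypothetical and only there for illustration.

```javascript
// Close the blob attached to a fetch response, if there is one.
// Returns true when a blob was closed, false otherwise.
function closeBodyBlob(response) {
  const blob = response != null ? response._bodyBlob : null;
  if (blob != null && typeof blob === "object" && typeof blob.close === "function") {
    blob.close();
    response._bodyBlob = null;
    return true;
  }
  return false;
}

// Hypothetical wrapper: send one request and always clean up afterwards.
// fetchImpl defaults to the global fetch and can be replaced in tests.
async function pushBatch(url, options, fetchImpl = fetch) {
  let response = null;
  try {
    response = await fetchImpl(url, options);
    return response.status;
  } finally {
    closeBodyBlob(response);
  }
}
```

Calling `closeBodyBlob` from a `finally` block means the blob is released whether the request succeeds or throws, and calling it twice on the same response is harmless.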