JNI DETECTED ERROR IN APPLICATION: use of deleted global reference
la-matthew-yang opened this issue ยท 21 comments
I have been struggling with this issue for a while.
I have two questions:
-
What could potentially cause "use of deleted global reference" in libgdx-oboe?
-
Is there any way to show function name and line number for libgdx-oboe in backtrace?
For all the other so files, there are function name and lines numbers. But for libgdx-oboe (from #12 to #16), there are no such info. Is there a config that I can set to show such info?
Thanks a lot.
10-23 16:22:20.547 32591 32619 F libc : Fatal signal 6 (SIGABRT), code -6 in tid 32619 (Thread-1623)
10-23 16:22:20.611 324 324 F DEBUG : *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
10-23 16:22:20.611 324 324 F DEBUG : Build fingerprint: 'asus/CN_P024/P024_3:6.0.1/MMB29M/CN_P024-V5.7.3-20161011:user/release-keys'
10-23 16:22:20.611 324 324 F DEBUG : Revision: '0'
10-23 16:22:20.611 324 324 F DEBUG : ABI: 'arm'
10-23 16:22:20.612 324 324 F DEBUG : pid: 32591, tid: 32619, name: Thread-1623 >>> com.xuexue.ai.english.android <<<
10-23 16:22:20.612 324 324 F DEBUG : signal 6 (SIGABRT), code -6 (SI_TKILL), fault addr --------
10-23 16:22:20.659 324 324 F DEBUG : Abort message: 'art/runtime/java_vm_ext.cc:410] JNI DETECTED ERROR IN APPLICATION: use of deleted global reference 0x2005aa'
10-23 16:22:20.660 324 324 F DEBUG : r0 00000000 r1 00007f6b r2 00000006 r3 98d63978
10-23 16:22:20.660 324 324 F DEBUG : r4 98d63980 r5 98d63930 r6 0000000b r7 0000010c
10-23 16:22:20.660 324 324 F DEBUG : r8 b8870700 r9 b4c37378 sl 00000000 fp 00000000
10-23 16:22:20.660 324 324 F DEBUG : ip 00000006 sp 98d63198 lr b6d6bf25 pc b6d6e320 cpsr 40070010
10-23 16:22:20.685 324 324 F DEBUG :
10-23 16:22:20.685 324 324 F DEBUG : backtrace:
10-23 16:22:20.685 324 324 F DEBUG : #00 pc 00044320 /system/lib/libc.so (tgkill+12)
10-23 16:22:20.685 324 324 F DEBUG : #01 pc 00041f21 /system/lib/libc.so (pthread_kill+32)
10-23 16:22:20.685 324 324 F DEBUG : #02 pc 0001ba4f /system/lib/libc.so (raise+10)
10-23 16:22:20.685 324 324 F DEBUG : #03 pc 00018bf1 /system/lib/libc.so (__libc_android_abort+34)
10-23 16:22:20.685 324 324 F DEBUG : #04 pc 000167b0 /system/lib/libc.so (abort+4)
10-23 16:22:20.685 324 324 F DEBUG : #05 pc 0032140d /system/lib/libart.so (_ZN3art7Runtime5AbortEv+212)
10-23 16:22:20.685 324 324 F DEBUG : #06 pc 000f3cdd /system/lib/libart.so (_ZN3art10LogMessageD2Ev+2212)
10-23 16:22:20.686 324 324 F DEBUG : #07 pc 0024f86d /system/lib/libart.so (_ZN3art9JavaVMExt8JniAbortEPKcS2_+1524)
10-23 16:22:20.686 324 324 F DEBUG : #08 pc 0024fc79 /system/lib/libart.so (_ZN3art9JavaVMExt9JniAbortFEPKcS2_z+68)
10-23 16:22:20.686 324 324 F DEBUG : #09 pc 0033e3a7 /system/lib/libart.so (_ZNK3art6Thread13DecodeJObjectEP8_jobject+686)
10-23 16:22:20.686 324 324 F DEBUG : #10 pc 000fff25 /system/lib/libart.so (_ZN3art11ScopedCheck5CheckERNS_18ScopedObjectAccessEbPKcPNS_12JniValueTypeE.constprop.95+900)
10-23 16:22:20.686 324 324 F DEBUG : #11 pc 00107837 /system/lib/libart.so (_ZN3art8CheckJNI14GetObjectClassEP7_JNIEnvP8_jobject+370)
10-23 16:22:20.686 324 324 F DEBUG : #12 pc 00046b33 /data/app/com.test.android-1/lib/arm/liblibgdx-oboe.so
10-23 16:22:20.686 324 324 F DEBUG : #13 pc 00045ab7 /data/app/com.test.android-1/lib/arm/liblibgdx-oboe.so
10-23 16:22:20.686 324 324 F DEBUG : #14 pc 0004421b /data/app/com.test.android-1/lib/arm/liblibgdx-oboe.so
10-23 16:22:20.686 324 324 F DEBUG : #15 pc 0004409f /data/app/com.test.android-1/lib/arm/liblibgdx-oboe.so
10-23 16:22:20.686 324 324 F DEBUG : #16 pc 0005111d /data/app/com.test.android-1/lib/arm/liblibgdx-oboe.so
10-23 16:22:20.686 324 324 F DEBUG : #17 pc 000085d5 /system/lib/libwilhelm.so
10-23 16:22:20.686 324 324 F DEBUG : #18 pc 000609c1 /system/lib/libmedia.so (_ZN7android10AudioTrack18processAudioBufferEv+1232)
10-23 16:22:20.686 324 324 F DEBUG : #19 pc 00061631 /system/lib/libmedia.so (_ZN7android10AudioTrack16AudioTrackThread10threadLoopEv+172)
10-23 16:22:20.686 324 324 F DEBUG : #20 pc 0001011d /system/lib/libutils.so (_ZN7android6Thread11_threadLoopEPv+112)
10-23 16:22:20.686 324 324 F DEBUG : #21 pc 00041823 /system/lib/libc.so (_ZL15__pthread_startPv+30)
10-23 16:22:20.686 324 324 F DEBUG : #22 pc 00019285 /system/lib/libc.so (__start_thread+6)
10-23 16:22:22.683 1196 2349 E NativeCrashListener: Exception dealing with report
10-23 16:22:22.683 1196 2349 E NativeCrashListener: android.system.ErrnoException: read failed: EAGAIN (Try again)
10-23 16:22:22.683 1196 2349 E NativeCrashListener: at libcore.io.Posix.readBytes(Native Method)
10-23 16:22:22.683 1196 2349 E NativeCrashListener: at libcore.io.Posix.read(Posix.java:169)
10-23 16:22:22.683 1196 2349 E NativeCrashListener: at libcore.io.BlockGuardOs.read(BlockGuardOs.java:230)
10-23 16:22:22.683 1196 2349 E NativeCrashListener: at android.system.Os.read(Os.java:367)
10-23 16:22:22.683 1196 2349 E NativeCrashListener: at com.android.server.am.NativeCrashListener.consumeNativeCrashData(NativeCrashListener.java:240)
10-23 16:22:22.683 1196 2349 E NativeCrashListener: at com.android.server.am.NativeCrashListener.run(NativeCrashListener.java:138)
- It's actually very odd. Either it's a multi-threading problem that I overlooked (related to get/set variables of kotlin classes from the native part of the code), or music end callback from
setOnCompletionListener
. - All the libraries you have in your stacktrace are built with debug symbols in them (probably, you're running a debug build). But libgdx-oboe is distributed in the release version, without such symbols. The solution is to clone the repo, and declare a local dependency instead of the remote one. In that case, you'll have to setup NDK and all other requirements for native andriod development.
So are you using threads other than main in the app ? If not, are you using setOnCompletionListener
method ?
I think I have finally figured out what is causing this issue. :)
Essentially, it is because completion listener is notified after OboeMusic.dispose()
is already called. When this happens, calling jvm_class(context->GetObjectClass(new_callback))
causes JNI ERROR: use of deleted global reference
.
I am not sure how this is possible. Because OboeMusic.stop()
is always called before OboeMusic.dispose()
. Maybe OboeMusic.stop()
has no effect when the music is very close to completion?
I tried to solve this issue by calling OboeMusic.setOnCompletionListener(null)
before OboeMusic.dispose()
. But it seems that OboeMusic.setOnCompletionListener(null)
is not supported.
Oh. It might actually be possible because of OboeMusic.setOnCompletionListener(null)
since it doesn't seems to reset native music callback. Shouldn't be possible after OboeMusic.dispose()
, because it just deletes the native music object (which actually calls the callback), though I'm not sure if this is some thread shenanigans, in that case everything might be possible.
I sadly can't test any of this, because my pc died (and my current can't run the heavy android toolchain). ๐
How often does it crashes ? Are crashes regular or you can't reliably replicate them ? Does it crashes if no OboeMusic.setOnCompletionListener(null)
is called ?
I think you are right about the thread shenanigans. Based on the logs, it seems that AudioTrack::AudioTrackThread::threadLoop()
triggers completion listener right after another thread calls OboeMusic.dispose()
.
Do you think adding a listener check in on_complete could fix the issue?
OBOEMUSIC_METHOD(void, setOnCompletionListener) (JNIEnv* env, jobject self, jobject callback) {
auto context = jni_context::acquire_thread();
if (auto old_callback = get_var_as<_jobject>(env, self, "listener")) {
context->DeleteGlobalRef(old_callback);
}
if (auto instance = get_var_as<std::shared_ptr<music>>(env, self, "music")) {
if (callback) {
auto new_callback = context->NewGlobalRef(callback);
auto self_weak = context->NewWeakGlobalRef(self);
set_var_as(env, self, "listener", new_callback);
(*instance)->on_complete([self_weak, new_callback] {
if (auto new_callback = get_var_as<_jobject>(env, self, "listener")) {
auto context = jni_context::acquire_thread();
auto callback_class = jvm_class(context->GetObjectClass(new_callback));
callback_class.execute_method<void(Music)>(new_callback, "onCompletion", Music{self_weak});
}
});
}
}
}
Off the top of my head, I think something like that might work:
OBOEMUSIC_METHOD(void, setOnCompletionListener) (JNIEnv* env, jobject self, jobject callback) {
auto context = jni_context::acquire_thread();
if (auto old_callback = get_var_as<_jobject>(env, self, "listener")) {
context->DeleteGlobalRef(old_callback);
old_callback = nullptr;
set_var_as<_jobject>(env, self, "listener", nullptr);
}
if (auto instance = get_var_as<std::shared_ptr<music>>(env, self, "music")) {
if (callback) {
auto new_callback = context->NewGlobalRef(callback);
auto self_weak = context->NewWeakGlobalRef(self);
set_var_as(env, self, "listener", new_callback);
(*instance)->on_complete([self_weak, new_callback] {
if (auto new_callback = get_var_as<_jobject>(env, self, "listener")) {
auto context = jni_context::acquire_thread();
auto callback_class = jvm_class(context->GetObjectClass(new_callback));
callback_class.execute_method<void(Music)>(new_callback, "onCompletion", Music{self_weak});
}
});
} else {
(*instance)->on_complete(nullptr);
}
}
}
and
OBOEMUSIC_METHOD(void, dispose) (JNIEnv* env, jobject self) {
if (auto instance = get_var_as<std::shared_ptr<music>>(env, self, "music")) {
(*instance)->on_complete(nullptr);
}
if (auto callback = get_var_as<_jobject>(env, self, "listener")) {
auto context = jni_context::acquire_thread();
context->DeleteGlobalRef(callback);
callback = nullptr;
set_var_as<_jobject>(env, self, "listener", nullptr);
}
delete_var<std::shared_ptr<music>>(env, self, "music");
}
And if that doesn't work, then some spinlocks are required sadly
I think this looks good. I really want to give this a try. But I don't have much experience with NDT or ffmpeg, so I am not sure how long it will take for me to have everything setup. Does the build process work for Mac OS?
Does the build process work for Mac OS?
Yeah, I assume so, yes. You actually don't need anything other than Android NDK and regular android setup. ffmpeg library is pre-built, and normally shouldn't be rebuilt
Thanks! I will give it a try today.
I have the project setup. I am trying to debug the problem. Do you know how to generate the debug symbols so that I can use ndk-stack?
I got ndk-stack working. :) You code is well organized.
After some investigation, I think I have found the root cause of the issue.
It is actually not caused by native completion listener. If I remove the usage of the native completion listener completely, the issue still occurs. Although based on my limited testing, usage of native completion listener increases the chance of the issue occurring.
I think the issue is caused by concurrent modification of audio resources. If music::dispose and music::stop are never called, the issue doesn't occur at all.
The issue probably occurs when a music is being used in mixer::render by the system audio thread, while it is disposed or stopped by a game thread at exactly the time.
The resulting native error has many different stacktraces. Most of them have the message "fault addr". Some have "ConcurrentGC" in their stack trace.
Sometimes, the issue doesn't cause a crash. Instead, it causes Oboe system to stop working. Afterwards, Gdx.audio.newMusic() hands and causes app to get stuck.
I tried putting thread lock in different places, but couldn't get it to work.
The options that I can think of for fixing this issue are:
- Mark music as disposed or stopped in game thread and perform the actual dispose and stop in mixer::render to avoid concurrent usage
- Use one oboe stream with SharingMode::Shared for each music. This way, music is only modified by its own audio thread. The sound mixing is done by the Android system.
The second option is great. Don't know why I didn't use shared oboe streams in the first place ๐คท
That may still require a bunch of atomic_flag
tests, though.
I should get my computer fixed soon, I'll look into it. Although, refactoring will likely take some time.
Big thanks for suggestions and tinkering with the code ๐
@barsoosayque Do you have a preferred way of communication like WhatsApp, Skype, Email, etc? I would like to talk to you about some potential collaboration opportunities, if you don't mind.
Great! I've sent you an email.
I have refactored callback logic in a0ad49b. And per my tests in 612edb2, I can't get any concurent crashes. I want to believe that the issue is somehow fixed, but pretty much open to suggestions on how to test it further.
After some thoughts, I think it's preferable to leave oboe stream as exclusive, since it offers the lowest latency.
https://github.com/google/oboe/blob/83f79e93e32dee9e2b5d3ced07dc94c82d4bc182/include/oboe/Definitions.h#L193-L213
Well, I would like to believe that this issue is fixed, so I will close it. If the crash still occurs, please reopen.