facebook/react-native

Async JSI functions with promises block the event loop indefinitely

mfbx9da4 opened this issue Β· 20 comments

Description

While playing with the JSI I have found that using Promises in conjunction with async JSI functions can lead to the JS thread being blocked. According to the react native profiler the JS thread is still operating at 60fps but the JS thread is not responding to tap handlers and never resolves the promise. Note this happens sometimes and is hard to predict which suggests there is some race condition.

I am able to unblock the JS thread by periodically kicking the event loop into action with something like

const tick = async () => {
  await new Promise(r => setTimeout(r, 1000))
  tick()
}
tick()

I've also noticed that using Promises in conjunction with the JSI can occasionally lead to extremely long await times for the promise to be resolved, in the order of 250ms.

I have only tested this on iOS without Hermes. Occurs on both simulator and device.

Version

0.67.1

Output of npx react-native info

System:
    OS: macOS 12.0.1
    CPU: (10) x64 Apple M1 Max
    Memory: 20.13 MB / 32.00 GB
    Shell: 5.8 - /bin/zsh
  Binaries:
    Node: 14.18.2 - ~/.nvm/versions/node/v14.18.2/bin/node
    Yarn: 1.22.17 - ~/.nvm/versions/node/v14.18.2/bin/yarn
    npm: 6.14.6 - ~/.nvm/versions/node/v14.18.2/bin/npm
    Watchman: 2021.12.20.00 - /usr/local/bin/watchman
  Managers:
    CocoaPods: 1.11.2 - /usr/local/bin/pod
  SDKs:
    iOS SDK:
      Platforms: DriverKit 21.0.1, iOS 15.0, macOS 12.0, tvOS 15.0, watchOS 8.0
    Android SDK: Not Found
  IDEs:
    Android Studio: Not Found
    Xcode: 13.1/13A1030d - /usr/bin/xcodebuild
  Languages:
    Java: Not Found
    Python: 2.7.18 - /usr/bin/python
  npmPackages:
    @react-native-community/cli: Not Found
    react: 16.13.1 => 16.13.1 
    react-native: 0.63.4 => 0.63.4 
    react-native-macos: Not Found
  npmGlobalPackages:
    *react-native*: Not Found

Steps to reproduce

A minimal failing example can be found here https://github.com/mfbx9da4/react-native-jsi-promise

1

Create an async native JSI function which spawns a thread, does some work and then resolves once it's done its work. You can use a callback or a promise to resolve from the native function. I've opted to use a callback as there are less moving parts this way. You could also use std::thread instead of GCD's dispatch_async for spawning the thread, the behaviour is observed with both. I've opted to use dispatch_async below.

    auto foo = jsi::Function::createFromHostFunction(
        jsiRuntime,
        jsi::PropNameID::forAscii(jsiRuntime, "foo"),
        1,
        [cryptoPp, myqueue](jsi::Runtime& runtime, const jsi::Value& thisValue, const jsi::Value* arguments, size_t count) -> jsi::Value {

            auto userCallbackRef = std::make_shared<jsi::Object>(arguments[0].getObject(runtime));

            dispatch_async(myqueue, ^(void){
                auto val = jsi::String::createFromUtf8(runtime, std::to_string(std::rand()));
                auto error = jsi::Value::undefined();
                userCallbackRef->asFunction(runtime).call(runtime, error, val);
            });

            return jsi::Value::undefined();
        }
    );
    jsiRuntime.global().setProperty(jsiRuntime, "foo", std::move(foo));

Source code here

2

Call the JSI function in a loop resolving the promise

for (let i = 0; i < 10; i++) {
    const start = Date.now();
    await new Promise(r => {
      jsiPromise.foo((err, x) => {
        console.log(i, 'err, x', err, x, Date.now() - start);
        r(x);
      });
    });
  }

Source code here

It might take several refreshes to get stuck but the JS thread does eventually get stuck.

Snack, code example, screenshot, or link to a repository

https://github.com/mfbx9da4/react-native-jsi-promise

Does this happen on JSC as well as on Hermes?

Only tested JSC and iOS thus far but present on device and simulator. I have now updated the description.

Hey @mfbx9da4, I believe you've already seen how we approached fixing a similar issue in Realm: realm/realm-js#4330. In our case, we'd see the UI being out of sync after any kind of async update had happened. As soon as you touched the screen, the UI would update. I'm not sure if that's exactly the same as what you are seeing, but it sounds pretty similar.

Hopefully my description on our issue is clear – in essence it seems that React Native queues up async work created by e.g. setTimeout or promises (which use setImmediate under the hood) as "microtasks", and then relies on messages coming via the bridge to flush that queue: https://github.com/facebook/react-native/blob/main/Libraries/BatchedBridge/MessageQueue.js#L108. This is the same on both JSC and Hermes, from what I can see.

Because we bypass the React Native bridge in Realm, and instead work directly with the JavaScript engine, when an async function call from our native code to JS happened, this did not hit any React Native codepath, and so RN had no way of knowing it should flush this queue as a result of the change. I guess your situation could be similar, I wasn't quite sure what I was looking for in your sample app, both buttons seemed to work OK.

We are currently thinking we will go with an approach of manually calling require("react-native/Libraries/Core/Timers/JSTimers.js").callReactNativeMicrotasks() whenever we make a native -> JS async call (or with some kind of throttling) – we couldn't find a more elegant way to trigger this. One alternative we considered was sending a dummy message via the React Native bridge in the usual native module style, which would have the effect of triggering a flush of the microtask queue.

I'd be interested to know if anyone has any other thoughts on more elegant ways to solve this though!

This is copied from my replies to @kraenhansen at the #jsi channel in Discord. I just realized the contexts are not exactly the same.

Thanks for flagging this. This is a great question. We also realized the original bridge-based microtask mechanism wouldn't work for sync C++ -> JS call in the new architecture.

I added drainMicrotasks to JSI and this performMicrotaskCheckpoint to JSIExecutor a while ago for the exact purpose you mentioned (to flush microtasks on the JS side before getting back to C++).

That being said,

  1. It wouldn't be available (until, hopefully, March) because we are still in the process of migrating to that. Under the hood, it's a completely different mechanism with the JSTimer::callReactNativeMicrotask so we have to be careful.
  2. I wanna raise up another concern which is that JSI API need integration from JS engine. So actually, I think we may only migrate RN on Hermes to this "new JSI-based microtask mechanism" because it may only works for Hermes. It wouldn't just work for JSC/V8 until some works have been done there.

So...I think it's about right to directly call require("react-native/Libraries/Core/Timers/JSTimers.js").callReactNativeMicrotasks() for now (for Hermes) and for a while (for JSC, etc.) to unblock any development.

If it became a norm, I think it may make sense to expose a drainMicrotasks to globalThis or something equivalent for library developers to call from the JS side as well. πŸ€” Although technically, this probably shouldn't be exposed to JavaScript.

I'll keep you posted whenever I have a update, but also feel free to follow-up!

Hey @Huxpro , thanks so much for the detailed answer and it’s great to know we were on the right track!

We implemented the solution of calling callReactNativeMicrotasks() immediately after we make any call from C++ to JS, and it improved things a lot, but we do still see occasional issues where the UI is not updated until something else (e.g. a touch) triggers a bridge message.

I also tried implementing a solution where we send a β€œdummy” message over the bridge with every C++ -> JS update, using the usual native modules method, and this works perfectly, but feels a little inelegant.

I was wondering whether there might be more going on under the hood when you send a bridge message, i.e. we are missing some other important part when we just call callReactNativeMicrotasks() directly – but your response makes me think this could be a timing thing/race condition instead – e.g. sending a bridge message takes longer than a direct call into JS, and when we callReactNativeMicrotasks() directly, occasionally some microtask has not even been queued yet.

I will experiment a little and report back. Thanks for the input!

Thanks for all your input on this. Sorry I haven't responded yet, I haven't got round to testing the proposed solutions. For now, I'm avoiding using the JSI for asynchronous methods. @Huxpro could I get a link to the #jsi discord?

@tomduncalf

but we do still see occasional issues where the UI is not updated until something else (e.g. a touch) triggers a bridge message.

I've seen that and I'm not sure if that's an expected behavior or not.

Actually, I think instead of calling callReactNativeMicrotasks on JS, it's probably better to call JSIExecutor::flush on C++. But I don't understand why TurboModule aren't flushing things properly automatically for you.

I'll try to loop in someone understanding this better.

It looks like you call that method from a different thread.
I guess you should schedule that callback to JS thread with Jscallinvoker which if I remember correctly is keeping count of tasks like that.

@Huxpro Thanks for the pointer to try calling JSIExectuor::flush! This seems to work perfectly if we call it after every C++ to JS call, so it seems that some of the extra work flush does vs callReactNativeMicrotasks is important.

However, I couldn't work out a way to call flush from our code without adding some extra code to React Native to expose flush on the RCTCxxBridge: https://github.com/facebook/react-native/pull/33201/files. Perhaps you know of some other way we can access this from outside?

Regarding "I don't understand why TurboModule aren't flushing things properly automatically for you" – we are calling directly from C++ to JS, using JSC’s JSObjectCallAsFunction in our master branch and the JSI equivalent, JSIFunc->call in our hermes branch, so I suspect RN has no way of knowing that anything needs to be flushed.

While this is probably a slightly unusual pattern for a React Native module to communicate to JS right now, our understanding is that with the new architecture, this kind of direct C++ to JS communication will become more commonplace, so others may hit the same issues – or is the intention that libraries still use the EventEmitter pattern with TurboModules?

To provide an update on this, I found that we were able to expose the jsCallInvoker member of the RCTBridge, which has a method invokeAsync which calls a lambda you pass to it on the JS thread, then calls flush(). This allows us to flush the UI queue without needing to modify React Native to expose JSIExecutor::flush() directly.

The lambda to invokeAsync could just be a no-op, but in our case I'm using to reset a flag which prevents us making multiple calls to invokeAsync if one is already pending (e.g. if many C++ to JS calls occur in a short space of time).

You can see how we've implemented this for Realm in this PR, essentially we create a std::function which calls invokeAsync and then pass this through our initialization code so that our "call JS function" abstraction (we abstract over multiple Javascript engines) can call it whenever we call into JS from C++.

I'd be interested to know if there is a more "official" solution for this scenario planned though!

This issue is stale because it has been open 180 days with no activity. Remove stale label or comment or this will be closed in 7 days.

Although we have a workaround, I still feel this issue is important to get a proper fix for.

@kraenhansen does this still happen in 0.70?

We'll be releasing the first RC of 71 over the next couple weeks, might be worth checking out if it's still the case on that one

Although we have a workaround, I still feel this issue is important to get a proper fix for.

+1 to this. Also for anything New Architecture related like this one, posting on reactwg/react-native-new-architecture should be the preferred way as we're getting a lot of noise on the React Native issue tracker, causing issue like this one to get lost.

What should I do In JNI code

@310381453 here is how I did it for Android: PR. Some of the complexity may be related to how Realm is set up. I seem to recall that if you use Turbomodules it is easier to get hold of the async call invoker.

@kelset we're still seeing this on RN 0.71.0 (just tested with 0.71.0-rc.5 from our bindgen branch which doesn't have the workaround we mentioned above).

It looks like you call that method from a different thread. I guess you should schedule that callback to JS thread with Jscallinvoker which if I remember correctly is keeping count of tasks like that.

As mentioned by @Szymon20000, this could be a problem in your code example @mfbx9da4. JS runtime can only be accessed from single thread at a time. Access from multiple threads is undefined behaviour. Could you fix it and see if you can still repro the problem?

Since this issue is a big problem for us because this behavior blocks our callbacks for push messages if any setTimeout is currently running, we implemented some workaround. We built a timer hook that calls setTimeout without the milliesconds parameter periodically, hence it releases the javascripts macro task queue after each execution. In this function we check if the time elapsed since the timer starts.

Hope this workaround helps for all having this issue.

Hook implementation:

import { useCallback, useRef } from 'react';

/**
 * Props for the useTimeout hook.
 */
interface UseTimeoutProps {
    // number of milliseconds after that the callback is called
    timeout: number;

    // The callback that is called after the timeout exceeds
    callback: () => void;
}

/**
 * Hook that provides a setTimeout alternative to provide a workaround for the issue https://github.com/facebook/react-native/issues/33006 .
 * The callback given by the props is called after the timeout given by the props (in milliseconds) exceeds.
 * The hook returns a start and stop function that can be used to start and stop the timer.
 *
 * @param props
 */
export const useTimeout = (props: UseTimeoutProps) => {
    const startDate = useRef<Date | null>(null);

    /**
     * Checks if the timeout was reached. If yes, the callback will be called. If no,
     * the method will queue itself to the JavaScript macro tasks to check again.
     */
    const macroTaskRun = useCallback(() => {
        if (!startDate.current) {
            return;
        }

        const tmpStartDate = new Date(startDate.current);
        tmpStartDate.setMilliseconds(tmpStartDate.getMilliseconds() + props.timeout);

        if (new Date() >= tmpStartDate) {
            props.callback();
        } else {
            setTimeout(macroTaskRun);
        }
    }, [props]);

    /**
     * Starts the timer.
     */
    const start = useCallback(() => {
        startDate.current = new Date();
        macroTaskRun();
    }, [macroTaskRun]);

    /**
     * Stops the timer.
     */
    const stop = useCallback(() => {
        startDate.current = null;
    }, []);

    return {
        start,
        stop,
    };
};

usage:

...
const timeoutCallback = useCallback(() => console.log('finished'), [])

const { start, stop } = useTimeout({ timeout: 20000, callback: timeoutCallback });

useEffect(() => {
    start();
}, []);

useEffect(() => {
    stop();
}, [someStopCondition]);
...

Good news: In an attempt to reproduce the behaviour through a C++ native module in the RN Tester app, I finally realized the culprit of our issue. We were relying on platform primitives, such as the CoreFoundation RunLoop (on iOS) and ALooper on Android to schedule async work. As outlined in the description of realm/realm-js#6791 we should instead schedule work trough the React Native CallInvoker API. I consider this issue closed from the perspective of the realm package. A good followup action could be to document the importance of calling through the invoker and the reason for its existence.