[web] `ort.InferenceSession.create` silently hangs/fails on iOS/iPad browsers if COEP/COOP headers are set
josephrocca opened this issue · 12 comments
Describe the bug
COOP/COEP headers must be set to make the page cross-origin isolated, which allows the use of Wasm threads. With these headers set, the model doesn't load on iOS/iPadOS browsers: it simply "hangs" and never finishes initialization.
Urgency
The app is in production for thousands of users per day, but I'm not in a position to hurry you along 🙂
System information
- ONNX Runtime version: https://cdn.jsdelivr.net/npm/onnxruntime-web@1.11.0/dist/ort.js
This bug only occurs on iOS and iPadOS browsers. I've tested the latest versions of Chrome (v102) and Safari on iOS and iPadOS.
The bug does not occur on desktop browsers, including Safari on macOS. Every non-iOS/iPadOS browser that I've tested works fine.
To Reproduce
- Visit this minimal reproduction and try both links that you see: https://coop-coep-onnx-ios-1.joe64.repl.co/ The `/with-headers` page includes the COOP/COEP headers (which causes it to fail), and the `/without-headers` page doesn't have the headers (and it works normally/correctly).
- Here's the code for those pages: https://replit.com/@joe64/coop-coep-onnx-ios-1#index.js

Note that I've had to use Replit instead of a service like JSBin because JSBin doesn't allow you to set headers.
Additional context
- Note that, as you can see in the front-end code, I'm pre-downloading the file with `fetch` before passing it to `ort.InferenceSession.create`. This doesn't affect loading at all; I'm only doing this to ensure that the problem wasn't to do with model downloading.
- For convenience, here's a direct link to the model being loaded in the above-linked minimal reproduction: https://huggingface.co/rocca/informative-drawings-line-art-onnx/resolve/main/model.onnx
- The only way I was able to debug Chrome on iOS was to visit `chrome://inspect` and start the logger. I didn't observe any error messages, but it could be that error messages aren't being shown correctly on that page, so perhaps it's failing with an error that I'm unable to see.
This is likely due to the same cause as #11567.
My understanding is that multi-threaded WebAssembly is not supported on Apple devices, or at least on iOS (I am not 100% sure about this; correct me if I am wrong). So with the COOP/COEP headers set, ORT Web thinks it is OK to enable multi-threading, which causes the failure.
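The decision being described can be sketched as a small pure function. This is a hypothetical illustration, not ORT Web's actual internals; the names `pickNumThreads` and `isIOSWebKit`, and the user-agent heuristic, are assumptions.

```javascript
// Hypothetical sketch of the thread-count decision this comment describes,
// plus the iOS carve-out the thread converges on.
function isIOSWebKit(userAgent) {
  // All browsers on iOS/iPadOS are WebKit underneath. (Note: iPadOS Safari
  // can report itself as "Macintosh", so real-world detection is messier.)
  return /iPhone|iPad|iPod/.test(userAgent);
}

function pickNumThreads({ crossOriginIsolated, userAgent, hardwareConcurrency }) {
  // Without cross-origin isolation, SharedArrayBuffer is unavailable, so
  // multi-threaded Wasm cannot work anyway.
  if (!crossOriginIsolated) return 1;
  // Work around the iOS WebKit failure reported in this issue.
  if (isIOSWebKit(userAgent)) return 1;
  return Math.min(4, hardwareConcurrency || 1);
}
```

The point of the bug: ORT Web effectively skips the iOS check, so `crossOriginIsolated === true` alone is taken as permission to enable threads.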
@fs-eire Ah, I see. It looks like the latest macOS Safari has thread support, the issue you linked was using the latest Safari version, and in my testing macOS Safari works fine, so I'm not too sure what's going on there. It is specifically iOS browsers (i.e. iOS WebKit) that don't work for me.
I've just tried the support checks from this web.dev article, and iOS WebKit apparently supports everything except tail calls and SIMD:
const bigInt = () => (async e => { try { return (await WebAssembly.instantiate(e)).instance.exports.b(BigInt(0)) === BigInt(0); } catch (e) { return !1; } })(new Uint8Array([0, 97, 115, 109, 1, 0, 0, 0, 1, 6, 1, 96, 1, 126, 1, 126, 3, 2, 1, 0, 7, 5, 1, 1, 98, 0, 0, 10, 6, 1, 4, 0, 32, 0, 11]));
const bulkMemory = async () => WebAssembly.validate(new Uint8Array([0, 97, 115, 109, 1, 0, 0, 0, 1, 4, 1, 96, 0, 0, 3, 2, 1, 0, 5, 3, 1, 0, 1, 10, 14, 1, 12, 0, 65, 0, 65, 0, 65, 0, 252, 10, 0, 0, 11]));
const exceptions = async () => WebAssembly.validate(new Uint8Array([0, 97, 115, 109, 1, 0, 0, 0, 1, 4, 1, 96, 0, 0, 3, 2, 1, 0, 10, 8, 1, 6, 0, 6, 64, 25, 11, 11]));
const multiValue = async () => WebAssembly.validate(new Uint8Array([0, 97, 115, 109, 1, 0, 0, 0, 1, 6, 1, 96, 0, 2, 127, 127, 3, 2, 1, 0, 10, 8, 1, 6, 0, 65, 0, 65, 0, 11]));
const mutableGlobals = async () => WebAssembly.validate(new Uint8Array([0, 97, 115, 109, 1, 0, 0, 0, 2, 8, 1, 1, 97, 1, 98, 3, 127, 1, 6, 6, 1, 127, 1, 65, 0, 11, 7, 5, 1, 1, 97, 3, 1]));
const referenceTypes = async () => WebAssembly.validate(new Uint8Array([0, 97, 115, 109, 1, 0, 0, 0, 1, 4, 1, 96, 0, 0, 3, 2, 1, 0, 10, 7, 1, 5, 0, 208, 112, 26, 11]));
const saturatedFloatToInt = async () => WebAssembly.validate(new Uint8Array([0, 97, 115, 109, 1, 0, 0, 0, 1, 4, 1, 96, 0, 0, 3, 2, 1, 0, 10, 12, 1, 10, 0, 67, 0, 0, 0, 0, 252, 0, 26, 11]));
const signExtensions = async () => WebAssembly.validate(new Uint8Array([0, 97, 115, 109, 1, 0, 0, 0, 1, 4, 1, 96, 0, 0, 3, 2, 1, 0, 10, 8, 1, 6, 0, 65, 0, 192, 26, 11]));
const simd = async () => WebAssembly.validate(new Uint8Array([0, 97, 115, 109, 1, 0, 0, 0, 1, 5, 1, 96, 0, 1, 123, 3, 2, 1, 0, 10, 10, 1, 8, 0, 65, 0, 253, 15, 253, 98, 11]));
const tailCall = async () => WebAssembly.validate(new Uint8Array([0, 97, 115, 109, 1, 0, 0, 0, 1, 4, 1, 96, 0, 0, 3, 2, 1, 0, 10, 6, 1, 4, 0, 18, 0, 11]));
const threads = () => (async e => { try { return "undefined" != typeof MessageChannel && new MessageChannel().port1.postMessage(new SharedArrayBuffer(1)), WebAssembly.validate(e); } catch (e) { return !1; } })(new Uint8Array([0, 97, 115, 109, 1, 0, 0, 0, 1, 4, 1, 96, 0, 0, 3, 2, 1, 0, 5, 4, 1, 3, 1, 1, 10, 11, 1, 9, 0, 65, 0, 254, 16, 2, 0, 26, 11]));
So if those support checks are doing their job correctly, then it seems that this issue isn't to do with threads. That said, setting `ort.env.wasm.numThreads = 1` does indeed fix the issue, so maybe the above threads support check is incomplete/wrong?

For now I've added a check for `simd` support and set `numThreads = 1` if there is no support. This works for now, but perhaps only by accident 🤔
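The workaround just described can be sketched as follows, reusing the `simd` detector from the snippet above. Only `env.wasm.numThreads` is assumed from the ORT Web API; `configureThreads` is a hypothetical helper name.

```javascript
// Sketch of the workaround: probe SIMD support with the web.dev detector and
// fall back to a single thread when it is missing (which covers iOS WebKit).
const simd = async () =>
  WebAssembly.validate(new Uint8Array([0, 97, 115, 109, 1, 0, 0, 0, 1, 5, 1,
    96, 0, 1, 123, 3, 2, 1, 0, 10, 10, 1, 8, 0, 65, 0, 253, 15, 253, 98, 11]));

async function configureThreads(ortEnv) {
  const hasSimd = await simd();
  if (!hasSimd) {
    // iOS WebKit lacks SIMD; on those browsers multi-threading also fails,
    // so (perhaps only by coincidence) this check doubles as an iOS guard.
    ortEnv.wasm.numThreads = 1;
  }
  return hasSimd;
}
```

As the comment notes, this couples two unrelated features, so it may break if WebKit ever ships SIMD before fixing the threading issue.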
I made a change to optimize the feature detection; however, this needs to be validated before merge.
I debugged iOS Safari with the demo website. The failure on iOS is due to a `RangeError: Out of memory` error.
This error appears to happen inside `WebAssembly.instantiate`. Since the single-threaded version works totally fine (and it can handle 100MB+ models), this issue is weird.
I have no clue how to debug this further. I think it is simply a bug in iOS Safari (macOS Safari works fine). PR #11707 does not help resolve this, because both the old and the new detection return true for multi-thread support.
Is there a minimal reproduction of that `RangeError: Out of memory` that could be submitted as a WebKit bug report?

And in the meantime, is it possible to catch the `RangeError: Out of memory` and, if it's iOS, try falling back to single-threaded? Much better for it to work slowly than to not work at all on iOS.
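The suggested catch-and-retry fallback might look like this. It is a hedged sketch: `createWithFallback` is a hypothetical wrapper, and `createSession` stands in for `ort.InferenceSession.create`.

```javascript
// Sketch: try multi-threaded session creation first; if it throws (e.g. the
// "RangeError: Out of memory" seen on iOS), retry single-threaded.
async function createWithFallback(ortEnv, createSession, modelBytes) {
  try {
    return await createSession(modelBytes);
  } catch (err) {
    if (err instanceof RangeError) {
      ortEnv.wasm.numThreads = 1; // fall back to the known-good path
      return await createSession(modelBytes);
    }
    throw err; // unrelated failure; surface it to the caller
  }
}
```

One caveat with this approach: if the error is thrown asynchronously inside a worker rather than rejected from the `create` promise, a plain `try/catch` like this would not see it.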
I'm also experiencing this issue with multi-threading + COOP headers in browsers on iOS devices.
- https://github.com/jobergum/browser-ml-inference
- Live demo over at https://aiserv.cloud/
I tried this change, but in my case I'm not able to work around this with `ort.env.wasm.numThreads = 1`.
Update: I had to remove the COOP headers to make it work again on iOS devices. My model is pretty large, though: 22MB.
Is there any workaround or fix yet besides disabling multithreading on iOS? I'm still seeing this issue a year later, and I'd rather not disable multithreading on iOS since that's half of our userbase.
It has been some time, and I am not sure whether Apple has fixed this problem in Safari on iOS. @d12, do you still observe the issue on iOS?
On iOS 17.6 the problem is still present.
Seeing a crash during `ort.InferenceSession.create` on iOS 16.7.8, even with `ort.env.wasm.numThreads = 1`. It seems to crash on iPhone 8 but run slowly on iPhone X, both on iOS 16.7.8, so I'd be curious what else can be done to reduce the memory ceiling and prevent crashes.
When attempting to use `1.19.0-dev.20240727-1ce160883f`, I get this error, which I can't find any leads to solve:

```
wasm streaming compile failed: CompileError: WebAssembly.instantiateStreaming(): section (code 1, "Type") extends past end of the module (length 36659183, remaining bytes 12804882) @+8
```
> Seeing a crash during `ort.InferenceSession.create` on iOS 16.7.8, even with `ort.env.wasm.numThreads = 1`. It seems to crash on iPhone 8 but run slowly on iPhone X, both on iOS 16.7.8, so I'd be curious what else can be done to reduce the memory ceiling and prevent crashes.
>
> When attempting to use `1.19.0-dev.20240727-1ce160883f`, I get this error, which I can't find any leads to solve: `wasm streaming compile failed: CompileError: WebAssembly.instantiateStreaming(): section (code 1, "Type") extends past end of the module (length 36659183, remaining bytes 12804882) @+8`
This is weird. Did you verify whether it works in other environments (Windows/Mac/Android)?