Asyncify: Improve developer experience
adamziel opened this issue · 9 comments
The problem
PHP.wasm crashes if the ASYNCIFY_ONLY functions list in packages/php-wasm/compile/Dockerfile
is not exhaustive enough:
Whenever it crashes, the developer must analyze the stacktrace, update the list, and rebuild PHP. That DX is rough and many developers won't be willing to approach Asyncify issues.
Tl;dr solution idea
Autogenerate the list of ASYNCIFY_ONLY
functions with an automated test suite that would exercise all possible code paths that may trigger a network call in PHP.
The full story:
Technical details
Asyncify lets synchronous C or C++ code interact with asynchronous JavaScript. Technically, it saves the entire C call stack before yielding control back to JavaScript, and then restores it when the asynchronous call is finished. This is called stack switching.
Stack switching requires wrapping all C functions that may be found at a call stack at a time of making an asynchronous call. Blanket-wrapping of every single C function adds a significant overhead, which is why we maintain a list of specific function names:
wordpress-playground/packages/php-wasm/compile/Dockerfile
Lines 624 to 632 in 15a6609
Unfortunately, missing even a single item from that list results in a WebAssembly crash whenever that function is a part of the call stack when an asynchronous call is made.
@danielbachhuber asked:
It seems like we need to find a more permanent solution to these function crashes. I don't think we'll be able to tag a "stable" release with this behavior.
Is it possible to reliably list all the relevant functions by hand?
I think so.
The current list is a result of a long trial-and-error process, but there is a away to systematize it, or perhaps even automate it.
Source of the crash
The only async operations are network requests – we can work backwards from here to find all the relevant Zend engine functions.
The only way to trigger an async request is via Emscripten's createPeer
function which is triggered by connect(2), listen(2) and sendmsg(2).
The internal network functions are called by the following top-of-the-stack code paths:
- PHP stream handlers like
getimagesize("https://...")
- PHP network-related functions like
fsockopen
ormysql_connect
.
Top-of-the-stack functions can be called by the following bottom-of-the-stack code paths:
- A function call (
myfunc()
,$obj->method()
,$reflection->callMethod()
,new MyClas()
, etc.) - Resource cleanup (destructors, request shutdown handlers, etc.)
The big idea
- Find all top-of-the-stack functions that trigger network calls
- Find all bottom-of-the-stack functions that trigger the above
- Write a test suite for a cartesian product of the two
- Auto-append any missing items to
ASYNCIFY_ONLY
- Rebuild, rinse and repeat
This way, we'd get a more reliable and systematic list of ASYNCIFY_ONLY functions.
There's still a risk of missing some code paths, but:
- It should happen much less frequently
- The fix would only involve creating a unit test that reproduces the crash
- We could involve the output of ASYNCIFY_ADVISE in the process to minimize the risks
Furthermore, perhaps the error message could include a link to a new issue form with the stack trace and other relevant details already prepopulated.
Why other ideas won't work so well and what we can do about it
Auto-listing all the functions
Asyncify can auto-list all the required C functions when built without ASYNCIFY_ONLY
.
Auto-detection is overeager and ends up listing about 70,000 C functions which increases the startup time to 4.5s. This can be seen by adding -sASYNCIFY_ADVISE
to the final emcc
in the Dockerfile. Edit: I just tried this again and seems to be significantly less now. There's still a lot of unrelated functions there, though. Dynamic calls issued by internal functions like zend_call_method
don't play well with static analysis.
Eventually, V8 will likely handle stack switching for us and remove this problem entirely. Unfortunately, the timeline for landing this in Node.js is unclear. @ThomasTheDane – do you have any insights about that, by any chance?
Recovering from a crash
We cannot recover. A crash leaves PHP in an undefined state. Were we in a middle of a request? In a request shutdown handler? Do we need to clean up any dangling resources? There's no telling! Once a crash happened, we must shutdown the PHP runtime.
We can, however, automatically open a new runtime. It will crash again eventually, but at least the developer won't have to manually restart the app.
Automating recompilation
@danielbachhuber asked:
Could we abstract the ASYNCIFY functions to some configuration file, and then have this error message code do a smarter job identifying the missing function?
The answer is yes, but only for developers working with node.js packages inside WordPress Playground repo. No bueno for npm package consumers and on the web. This could still be useful.
Another idea: don’t do stack switching.
Right now we can call WASM function b while the call stack from function a waits to be resumed. PHP relies heavily on a global state stored im a heap so we will never exercise this possibility.
What if, instead of saving and restoring the call stack, we would always resume the last execution? Would it remove the need for the allowlist?
The answer should be somewhere here:
- https://kripken.github.io/blog/wasm/2019/07/16/asyncify.html
- https://github.com/WebAssembly/binaryen/blob/main/src/passes/Asyncify.cpp
Edit:
It seems like we can't skip stack switching. Asyncify seems to be "yielding back to JS" by transforming the compiled code in this fashion:
function asyncFunction() {
let x;
if (rewinding) { /* .. restore x .. */ }
if (normalExecution) {
x = anotherAsyncFunction();
}
if (waitingForJs) {
// .. save call index and x ..
return;
}
// .. further processing ..
return x;
}
The execution flow resembles this:
asyncFunction();
if ( waitingForJs ) {
// Everything after the async call short-circuited
// Let's await the async task now
await jsTask;
// Let's call the same function again, but set the
// global context in such a way to resume after the
// asynchronous part:
waitingForJs = false;
rewinding = true;
asyncFunction();
}
Asyncify must be aware of all the functions in the stack to add these conditions. Otherwise it couldn't short-circuit when one of the callees issues an async call. The ability to call another WASM function while waiting for async information is just a nice side-effect of how this process works.
Thank you so much @ThomasTheDane! Email pings unfortunately didn't transfer to GitHub so I looked up the usernames of all three folks you've mentioned – I hope I got it right! CC @fgmccabe @ajklein @dtig
Attempt at autogenerating the ASYNCIFY_ONLY
list based on a unit test suite: #253
At the risk of not fully understanding what you are trying to do ...
JSPI represents a different approach to ASYNCIFY, even though we use the emscripten flag ASYNCIFY=2 to enable it.
In essence, I think it is likely that JSPI addresses your concerns more directly.
With a JSPI'd (is that a verb?) WebAssembly module, when you call a marked export, that call is executed on a separate stack to the original. If the the call needs to be suspended, then the stack is simply put aside, and when the fetch request (say) is resolved, the 'put away' stack is reentered.
It is obviously more complex than this picture, but the essence is that the execution stack is not unwound/rewound during suspensions and resumptions. It also means that the only things that the developer has to pay attention to are the exports and imports involved.
@fgmccabe That's helpful, thank you! And it covers the essence of what I'm trying to do here, which is to make async calls without risking exceptions that can only be fixed by rebuilding the WebAssembly module.
I'd love to switch to JSPI entirely, however I understand it's only available in Chrome as of today. I understand the timeline for adoption in other browsers may be hard to reason about, but are there plans to bring JSPI to Node.js in the nearest future? It would solve this entire issue since only the Node.js version of WebAssembly PHP makes async calls at the moment.
The current implementation in V8 is essentially 'experimental status'. We have arm64 and x64 implementations.
The next steps are to implement on 32 bit arm/intel. That requires us to solve some issues that we did not have to solve so far.
As for node.js, my guess is that it is already in node, behind a flag.
To remove the flag requirement involves getting other implementations. The best estimate for that is towards the end of this year; but it obviously depends on resources and funding.
In addition, it would need further progress in the standardization effort; but, given that it is a 'small' spec, that should not be a long term burden.
Hope that this helps you understand the roadmap :)
This is very helpful @fgmccabe, thank you so much! I explored using JSPI in the Node.js version of WebAssembly PHP but got stuck at one point. I think I could push through by spending more time on a minimal reproduction of the problem, but since JSPI is still experimental and only available on arm64 and x64 I'll put it on a backburner for now. I'll keep watching the JSPI project and revisit adopting it here in a few months.
I just merged #253 which solves the core of this issue.
#253 avoids Node.js crashes, turns failures into catchable errors, ensures stack trace continuity, and makes the trace more useful by disabling minification of the JS module, introduces a unit test suite, and even automates updating the ASYNCIFY_ONLY
list.
The next step would be expanding the list of Asyncify test cases. Let's close this issue and track that separately in #273.