Network access in the browser
adamziel opened this issue · 20 comments
Latest status
Curl and TCP over fetch() are now a part of WordPress Playground. Here's what we still need to close this issue:
Description
WordPress Playground has only partial support for network calls.
Types of network calls in WordPress
wp_safe_remote_get
As of #724, Playground is capable of translating wp_safe_remote_get calls into JavaScript fetch() requests. This has limitations:
- Only https:// URLs are supported
- The server must provide valid CORS headers in the response
- Developers can't control all the headers
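The three limitations above can be illustrated with a small sketch. This is not Playground's code: `toFetchArgs` and `FORBIDDEN_HEADERS` are hypothetical names, and the header list is only a partial subset of the names the Fetch standard forbids scripts from setting.

```javascript
// Hypothetical helper, not part of Playground: illustrates the limits listed above.
// Partial list of request headers that browser scripts are not allowed to set.
const FORBIDDEN_HEADERS = new Set([
  'accept-encoding', 'connection', 'content-length', 'cookie', 'host',
  'origin', 'referer', 'transfer-encoding', 'upgrade', 'via',
]);

function toFetchArgs(url, headers = {}) {
  const parsed = new URL(url);
  // Limitation 1: only https:// URLs can be requested.
  if (parsed.protocol !== 'https:') {
    throw new Error(`Only https:// URLs are supported, got ${parsed.protocol}`);
  }
  // Limitation 3: some headers are controlled by the browser, not the caller.
  const allowed = {};
  const dropped = [];
  for (const [name, value] of Object.entries(headers)) {
    const lower = name.toLowerCase();
    if (FORBIDDEN_HEADERS.has(lower) || lower.startsWith('proxy-') || lower.startsWith('sec-')) {
      dropped.push(name);
    } else {
      allowed[name] = value;
    }
  }
  // Limitation 2: mode 'cors' means the response is only readable
  // if the server sends valid CORS headers.
  return { url: parsed.href, init: { headers: allowed, mode: 'cors' }, dropped };
}
```

The returned `url` and `init` could be passed to `fetch(url, init)`; the `dropped` list shows which headers the caller asked for but would not be able to control.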
Arbitrary network calls
Other methods of accessing the network, such as libcurl or file_get_contents, are not supported yet.
Web browsers do not allow the WebAssembly code to access the internet directly yet. A native socket API may or may not be released in the future, but there isn't one for now. #1093 would improve the situation.
In Node.js, Playground accesses the network using the following method:
- Set up a same-domain API endpoint that accepts network commands from the browser
- Capture socket function calls in the WebAssembly binary
- Pass them to JavaScript
- Pass the requested operation over the API endpoint using fetch() or WebSocket
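As a rough illustration of the last two steps, a captured socket call could be serialized into a binary message before it is sent over a WebSocket. The wire format below (opcode, socket id, payload) is invented for this sketch and is not the protocol Playground actually uses; `OP`, `encodeSocketCommand`, and `decodeSocketCommand` are hypothetical names.

```javascript
// Hypothetical wire format, invented for this sketch (NOT Playground's protocol):
// byte 0 = opcode, bytes 1..4 = socket id (big-endian uint32), rest = payload.
const OP = { CONNECT: 1, SEND: 2, CLOSE: 3 };

function encodeSocketCommand(op, socketId, payload = new Uint8Array(0)) {
  const frame = new Uint8Array(5 + payload.length);
  frame[0] = op;
  new DataView(frame.buffer).setUint32(1, socketId); // DataView writes big-endian by default
  frame.set(payload, 5);
  return frame;
}

function decodeSocketCommand(frame) {
  const view = new DataView(frame.buffer, frame.byteOffset, frame.byteLength);
  return { op: frame[0], socketId: view.getUint32(1), payload: frame.subarray(5) };
}

// In this sketch, a CONNECT command carries "host:port" as its payload.
const connectFrame = encodeSocketCommand(
  OP.CONNECT,
  7,
  new TextEncoder().encode('api.wordpress.org:443')
);
```

The browser side would send `connectFrame` with `webSocket.send(connectFrame)`, and the proxy would call `decodeSocketCommand` on each message before opening or writing to the real TCP socket.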
This may not be viable on the web, as someone would have to pay for the hardware to run the proxy on, and the proxy's nature means there are security risks related to accessing the local network.
Solution
After 1.5 years of exploring and discussing, this issue finally has a path forward:
- Merge #1093
- Merge #1273
- Ship curl in the browser
- This would enable requesting all CORS-enabled HTTPS endpoints via file_get_contents, curl_exec, etc., without going through wp_safe_remote_get().
For full networking support, we'd also need the following:
- Expose the Node networking proxy as a separate, runnable script
- Provide an API to connect it to the in-browser version of Playground
- Document the workflow
Nice to haves:
- Ship a version of the network proxy built as a PHP script to enable running a full-featured Playground build in the same environments as WordPress.
- Provide a Dockerfile to set up the network proxy and a few buttons for quickly spinning up proxy nodes in the cloud, e.g. on Cloudflare, DigitalOcean, etc.
Limitations of the approach
Limitations without the network proxy:
- Non-CORS URLs wouldn't work
- Non-HTTPS traffic wouldn't work
- gethostname and other low-level methods still wouldn't work
- SSL certificate checks, like the ones done by Composer, wouldn't work
All of the above could be resolved by plugging in a network proxy.
Other Alternatives
For posterity: I tried a custom Request_Transport that tunneled all traffic through the browser's fetch() using the vrzno extension by @seanmorris, and that worked well except for sites that didn't allow cross-origin requests, which is most sites.
Interestingly, I remember that WordPress Plugin Directory did not work in this setup. However, @dd32 pointed out that it exposes the correct access-control headers:
curl -is "https://api.wordpress.org/plugins/info/1.2/?action=query_plugins" | grep "^access-control"
access-control-allow-origin: *
So perhaps there is a way to support at least the api.wordpress.org requests with the browser's native fetch()? Let's revisit this idea.
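The check @dd32 ran with curl can be approximated in code. The sketch below takes raw response headers (such as the output of `curl -is`) and decides whether a simple, non-credentialed cross-origin read would be allowed. `corsAllowsOrigin` is a hypothetical helper, and it deliberately ignores preflight requests and credentials, so it is a simplification of what browsers actually enforce.

```javascript
// Hypothetical helper: a simplified CORS check for non-credentialed requests.
// Returns true when access-control-allow-origin is "*" or matches the origin exactly.
function corsAllowsOrigin(rawHeaders, origin) {
  for (const line of rawHeaders.split(/\r?\n/)) {
    const idx = line.indexOf(':');
    if (idx === -1) continue; // status line or blank line
    const name = line.slice(0, idx).trim().toLowerCase();
    if (name !== 'access-control-allow-origin') continue;
    const value = line.slice(idx + 1).trim();
    return value === '*' || value === origin;
  }
  return false; // no CORS header at all: the browser blocks the read
}
```

With the header shown above (`access-control-allow-origin: *`), this returns true for any origin, which is consistent with the idea that api.wordpress.org requests could go through the browser's native fetch().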
Networking is supported in the Node.js build as of #119: PHP sends data through a WebSocket to a local TCP proxy that handles the required network calls.
I can think of three ways to implement in-browser support:
- A server-side TCP proxy: the least handy of all, has terrible security implications.
- An in-browser TCP proxy: could be implemented as a browser extension, although Google Chrome deprecated the socket.tcp API for extensions.
- TCP to HTTP rewriting: The WebSocket class could be replaced with one that concatenates all the sent data and then reconstructs a fetch() call from them. Then, PHP can be compiled without OpenSSL support OR to treat all https requests as http ones so that the WebSocket shim could read raw data. The proxy itself could work as a same-tab fetch() or as a browser-extension fetch() to work around the CORS limitations. This wouldn't support arbitrary network traffic, but would be perhaps good enough for the most popular use-cases.
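A minimal sketch of the "TCP to HTTP rewriting" idea, assuming the shim already has the plaintext HTTP/1.1 request bytes that PHP wrote to the socket. `rawRequestToFetchArgs` and `findHeaderEnd` are hypothetical names, and the sketch ignores chunked request bodies and other edge cases a real implementation would need to handle.

```javascript
// Returns the index of the "\r\n\r\n" that ends the header block, or -1.
function findHeaderEnd(bytes) {
  for (let i = 0; i + 3 < bytes.length; i++) {
    if (bytes[i] === 13 && bytes[i + 1] === 10 && bytes[i + 2] === 13 && bytes[i + 3] === 10) {
      return i;
    }
  }
  return -1;
}

// Hypothetical helper: turns buffered request bytes into arguments for fetch().
// Returns null while the request is still incomplete, so the caller keeps buffering.
function rawRequestToFetchArgs(bytes, scheme = 'https') {
  const headerEnd = findHeaderEnd(bytes);
  if (headerEnd === -1) return null;

  const head = new TextDecoder().decode(bytes.subarray(0, headerEnd));
  const [requestLine, ...headerLines] = head.split('\r\n');
  const [method, path] = requestLine.split(' ');

  const headers = {};
  for (const line of headerLines) {
    const idx = line.indexOf(':');
    if (idx > 0) headers[line.slice(0, idx).trim().toLowerCase()] = line.slice(idx + 1).trim();
  }
  if (!headers['host']) throw new Error('Missing Host header');

  const length = Number(headers['content-length'] || 0);
  const body = bytes.subarray(headerEnd + 4);
  if (body.length < length) return null; // body not fully received yet

  const url = `${scheme}://${headers['host']}${path}`;
  // The browser sets these itself; scripts cannot override them.
  delete headers['host'];
  delete headers['content-length'];
  delete headers['connection'];

  return { url, init: { method, headers, body: length > 0 ? body.subarray(0, length) : undefined } };
}
```

The result could then be passed to `fetch(args.url, args.init)`. Working on bytes rather than decoded text keeps binary request bodies intact.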
Also linking to this related discussion.
Libraries like Composer require HTTPS and they verify the peer certificate by default: https://github.com/composer/composer/blob/11879ea737978fabb8127616e703e571ff71b184/src/Composer/Util/StreamContextFactory.php#L183-L197
As a workaround, networking in the browser could:
- Give PHP a fake wildcard CA cert
- Implement a fake endpoint for all HTTP requests that would feed PHP the fake certificate
- Parse the incoming request and re-issue it using fetch()
- Parse the response, encrypt using the fake certificate, feed it back to PHP
This will only work for endpoints exposing proper CORS headers, but it's a start.
Give PHP a fake wildcard CA cert
why not use a real chain of trust?
I'm very leery of building a system whose default is to strip away all security from TLS connections and present trust for everything.
particularly if we're trying to make it easy to instantly spool up systems with a blueprint, this could so easily lead to cross-site attacks: "Hey look at the plugin I wrote: [malware link]"
for what it's worth, the default Erlang net library sets verify_peer to false and it's a disaster because nobody remembers to activate it and supply proper certs.
maybe I'm misreading this, but I'd rather us avoid that mistake if it's what I think we're talking about
why not use a real chain of trust?
We do in Node.js. Browsers can't open raw TCP sockets so we need to re-issue the request using fetch(). The only way to do it is to MITM the PHP program to parse the encrypted request data.
Hosting a websocket proxy on e.g. free CloudFlare tier could solve this for now.
Hosting a websocket proxy
Possible candidates:
- websockify: WebSockets support for any application/server - novnc/websockify (Python)
- WebSockets Proxy: A websocket ethernet switch built using Tornado in Python - benjamincburns/websockproxy
- node-relay: A websocket ethernet switch built using Node.js - krishenriksen/node-relay
- zquestz/ws-tcp-proxy, snail007/goproxy (Go)
EDIT: Oh, I see there's already something like this implemented in @php-wasm/node, based on maximegris/node-websockify.
Oh, I see there's already something like this implemented in @php-wasm/node, based on maximegris/node-websockify.
Yup, it is used in the @php-wasm/cli, VS Code extension, and wp-now. The same proxy would just work with the web version if it was hosted somewhere. The custom parts were added to support setsockopt().
I wonder what could be achieved, if so, by using the Cloudflare TCP Sockets and running WP Playground on Cloudflare Worker / WASM / NodeJS?
#732 solves the bulk of the problem with issuing HTTP requests from WordPress. For full network support, we'll need to run a WebSockets proxy on the server.
Not urgent: what sort of use case would require the WebSocket support?
#1051 implements a HTTPS termination function. All PHP-initiated network traffic is intercepted by a "fake WebSocket" instance which then offers a self-signed HTTPS certificate and reads the raw HTTP traffic, rewrites it as a fetch() call, and streams the response back to PHP. Note this may only work for HTTP and HTTPS requests to URLs exposing valid CORS-headers. It won't work for arbitrary sockets.
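The "streams the response back to PHP" half can be sketched as follows. This is not the code from #1051: `responseToRawHttp` is a hypothetical helper, and it buffers the whole body instead of streaming it. It rebuilds plaintext HTTP/1.1 response bytes from the pieces a fetch() Response exposes.

```javascript
// Hypothetical helper: serializes a fetch() result back into raw HTTP/1.1 bytes.
// Assumes the caller has already read the full body into a Uint8Array.
function responseToRawHttp(status, statusText, headers, bodyBytes) {
  const lines = [`HTTP/1.1 ${status} ${statusText}`];
  for (const [name, value] of Object.entries(headers)) {
    const lower = name.toLowerCase();
    // fetch() has already decoded the body, so these headers no longer describe it.
    if (lower === 'content-encoding' || lower === 'content-length' || lower === 'transfer-encoding') {
      continue;
    }
    lines.push(`${name}: ${value}`);
  }
  // The two empty strings produce the blank line that ends the header block.
  lines.push(`Content-Length: ${bodyBytes.length}`, 'Connection: close', '', '');

  const head = new TextEncoder().encode(lines.join('\r\n'));
  const out = new Uint8Array(head.length + bodyBytes.length);
  out.set(head, 0);
  out.set(bodyBytes, head.length);
  return out;
}
```

With a real Response object, the arguments could come from `response.status`, `response.statusText`, `Object.fromEntries(response.headers)`, and `new Uint8Array(await response.arrayBuffer())`. The resulting bytes would still need to be encrypted before reaching PHP when the original request was HTTPS.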
That PR needs a lot of cleaning up, but the concept seems to be solid. It would unblock support for libcurl and stream wrappers like file_get_contents("https://...").
It took 1.5 years, but we now have a clear path to resolving this issue:
- Merge #1093
- Build PHP with libcurl, which @mho22 is exploring
This would enable requesting all CORS-enabled HTTPS endpoints.
For full networking support, we'd also need the following:
- Expose the Node networking proxy as a separate, runnable script
- Provide an API to connect it to the in-browser version of Playground
- Document the workflow
The proxy wouldn't be hosted on Playground.wordpress.net as it would be a resource drain, but we could make spinning your own proxy instance easy enough.
Nice to haves:
- Ship a version of the network proxy built as a PHP script to enable running a full-featured Playground build in the same environments as WordPress.
- Provide a Dockerfile to set up the network proxy and a few buttons for quickly spinning up proxy nodes in the cloud, e.g. on Cloudflare, DigitalOcean, etc.
@adamziel would love to chat about this at WCUS Contributor Day if you'll be around?
Curl is available in web browsers since #1935. fetch() is used as a network transport so the typical CORS limitations apply.
To solve, say, ~80% of the problem, we'd need to open up the CORS Proxy beyond talking to git. This is coming in the short to medium term.
To solve 100% of the problem, we'd need to tunnel the raw TCP traffic coming from Playground over a persistent WebSocket connection. In this scenario, we'd need a https://playground.wordpress.net/tcp-over-ws.php endpoint that would use stream_select to ingest data from Playground, pipe it to the network, and pipe the response bytes back to Playground. Definitely possible, especially with AsyncHttp\Client, but it's also non-trivial and I'm not sure what kind of appetite y'all have for such a feature. For now I'm taking a wild guess that this is a very low priority project. If this is something that would help you, please comment on this issue and describe your use-case. If enough people come in, I'm happy to make it happen.
For now, here's what we need to close this issue: