Require authentication on proxy endpoint

I'm away from a proper keyboard at the moment, so I can't follow up closely.

I thought there was an issue or discussion somewhere already on this but if this is a duplicate issue or inapplicable, feel free to close.

The corsProxy URL must require authentication so as not to expose an open/public proxy.

@bourgeoa

@csarven @jeff-zucker @zg009

The CorsProxy is implemented in

node-solid-server/lib/handlers/cors-proxy.js

Line 66 in f5652f3

app.get(path, extractProxyConfig, corsHandler, proxyHandler)

The route is registered with a path followed by three middlewares:

path is the proxy path, /proxy by default
extractProxyConfig is an NSS Express middleware

node-solid-server/lib/handlers/cors-proxy.js

Lines 69 to 70 in f5652f3

// Extracts proxy configuration parameters from the request

function extractProxyConfig (req, res, next) {

It checks that the destination URL is valid and that it is not in the unauthorized list; otherwise it returns 400.
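As a rough illustration of that kind of check (not the actual NSS code), a minimal sketch could look like the following. The function name `checkDestination` and the `blockedHosts` set are hypothetical, and the real handler may validate the URL differently.

```javascript
// Hypothetical sketch, not the NSS implementation.
// Returns the HTTP status the proxy would answer with:
// 200 if the destination may be proxied, 400 otherwise.
function checkDestination (rawUrl, blockedHosts = new Set(['localhost', '127.0.0.1'])) {
  let url
  try {
    url = new URL(rawUrl) // throws on an invalid or relative URL
  } catch (err) {
    return 400
  }
  // Only plain web destinations are accepted in this sketch
  if (url.protocol !== 'http:' && url.protocol !== 'https:') return 400
  // Reject hosts on the (assumed) unauthorized list
  if (blockedHosts.has(url.hostname)) return 400
  return 200
}
```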

We can implement some more controls in this middleware:

1. Is this an authenticated request?
   - On whom should these parameters depend: the pod provider, the pod owner, the authenticated user, ...?
   - How do we check for an authenticated request?
     - Is there a way to know who is authenticated?
2. Is the destination URL allowed? (white/black list)
   - The list location depends on the answers to 1.
     - It could be in the server code
     - on the pod, like /proxy/whiteList.ttl
     - on/through the WebID document of the authenticated user
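To make the two questions above concrete, here is a minimal sketch of an extra middleware that could run before extractProxyConfig. It is only an assumption about how this might be wired: `requireAuthenticated`, `getWebId` and `allowedHosts` are hypothetical names, and reading the destination from `req.query.uri` and the WebID from `req.session.userId` are assumptions about NSS that would need checking against the code.

```javascript
// Hypothetical sketch. `getWebId` is an assumed helper that returns the
// authenticated WebID for a request, or a falsy value for anonymous requests.
// `allowedHosts` is an optional Set of hostnames; where that list lives
// (server config, pod, WebID document) is left open, as in the list above.
function requireAuthenticated (getWebId, allowedHosts) {
  return function (req, res, next) {
    // 1. Is this an authenticated request?
    if (!getWebId(req)) {
      return res.status(401).send('Authentication required to use the proxy')
    }
    // 2. Is the destination allowed? Skipped when no list is configured.
    if (allowedHosts) {
      let host
      try {
        host = new URL(req.query.uri).hostname // `uri` parameter is an assumption
      } catch (err) {
        return res.status(400).send('Invalid destination URL')
      }
      if (!allowedHosts.has(host)) {
        return res.status(403).send('Destination not allowed')
      }
    }
    next()
  }
}

// Possible wiring (assumes the session carries the WebID as `userId`):
// app.get(path,
//   requireAuthenticated(req => req.session && req.session.userId),
//   extractProxyConfig, corsHandler, proxyHandler)
```

Keeping this as a separate middleware would leave extractProxyConfig unchanged, and a single-user self-hosted server could simply not install it.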

There is a need for a better understanding of the use cases and the security level expected.

In #111 @csarven writes : "CORS is PITA (Pain in the Ass) and for Solid applications to take advantage of resources beyond those on a Solid server or a server that properly implements CORS, there needs to be use of a proxy. Otherwise, a whole category of use cases for Solid applications is a non-starter. Again, dokieli has plenty of experience on this - resources could be anywhere. And undoubtedly other applications have run into similar shortcomings"

As mentioned previously, several of my applications are up against the same shortcomings of CORS as dokieli. Basically, without a proxy, entire realms of data are denied to us. I do not think we need further evidence that a CORS proxy for reading is needed. I do not see a need for writing via a proxy. If we allow writing via the proxy, then yes, for sure we need controls. But for just reading, why would we need to know that a user is authenticated or anything about their destination? Whitelist destinations? Why? How? Forbid proxying for non-RDF data? That cuts off huge amounts of data that become unusable by client applications.

Even if we do eliminate or restrict remote server proxying, the self-host option can benefit from proxying and in the single-user case there is absolutely no need for restrictions of the kinds proposed.

I'm not sure if corsProxy and authProxy ( https://github.com/nodeSolidServer/node-solid-server/blob/main/lib/handlers/auth-proxy.js ) need to be separate things. The proxy "resource" (service or endpoint or whatever) should be CORS enabled and require authn/z. So, 1) corsProxy may need to be deprecated and 2) authProxy brought up to speed? I don't know the inner details of these libs.

@jeff-zucker, but even for reading, personal or otherwise public (unauthenticated) proxies pose a concern for misuse because the URL would be known to anyone. That can be anything from random bots poking at it, to people using someone else's network/bandwidth, to downright retrieval of content against the law (in a nutshell). There may be ok/safe/controlled/narrow cases for the proxy URL to be used without authentication, but once the URL is known, it becomes a concern.

I just want to highlight that this is probably step one: at least limiting use to authorized users only. Multi-user setups / community servers (where anyone can sign up) need separate / additional layers for deciding if and how the proxy or proxies can be used. That cuts into ToS and stuff, and having some kind of moderation - which gets complicated in and of itself if it is anything beyond scripts differentiating allowed/disallowed use. I'm not an expert in this area. Just saying that it needs some care and more eyes, and possibly guidance to server admins / owners.

@csarven I'm new to CORS proxies myself so I have to ask: If it's being provided through a Solid-compliant resource server as a service, what would be needed beyond an authorization check for an authorized request through the proxy? Or are my assumptions about the flow of this wrong?
