crim-ca/weaver

Weaver passthrough all HTTP headers from client's request up to the providers

Closed this issue · 7 comments

Describe the bug

When a client POST a task to Weaver, all HTTP headers from the POST request is passed as-is to the WPS provider, which can lead into unexpected behavior and potential security issue.

How to Reproduce

If the WPS provider is hosted behind Cloudflare, Cloudflare will reply with a 400 error with no reason or message because the Host header is invalid.

Expected behavior

Only relevant headers should be passed-through to WPS provider, such as Authentication or Tokens that needs to be passed to the WPS provider for the task to run successfully. A .ini entry such as weaver.wps_client_headers_whitelist should be added.

Sounds good. Do you know of any other headers than Host that could be a problem? All known problematic headers could be provided by default in a filtering list. One issue I can foresee with the whitelist approach is that some extra headers that could be relevant might get lost, such as X-WPS-Output-Context.

Uhmm... regex whitelist? '^(X-WPS-.*|Authorization)$' ?

I'm more concerned about over-filtering headers and breaking some obscure HTTP functionality. Removing headers that are known to cause problems is "safer" IMO.

Weaver can add its own HTTP headers when required.
Since Weaver is a workflow manager rather than an HTTP proxy, the HTTP headers shouldn't be blindly passed from a client to a provider.

In the case of WPS providers, it somewhat acts as a proxy with a conversion layer on top for WPS <-> OGC API - Processes representation. In this case, the cookies and auth details must be passed down to the provider service. Since Weaver cannot know which headers are intended for the downstream WPS, it currently passes all of them. We can however filter out the ones that are known to cause problems like Host.

Ok then, would weaver.wps_client_headers_denylist be a good wording for the option?

Yes. And it would have Host only by default to start with.