iipc/jwarc

disable serviceworker in replay proxy mode

sberequek opened this issue · 3 comments

Hi,

when running jwarc as a replay proxy, is there a way to disable the serviceworker script injection?
Looking at the source code of the WarcServer class, I would like to know whether a parameter could be added to the "replay" GET request that changes the value of the "proxy" argument. Currently the replay method is always called with "proxy" set to false (line 112).

Thanks

ato commented

Proxy requests are handled by the proxy(HttpExchange) method which calls replay() with the proxy argument set to true.

replay(exchange, exchange.param(0), date, true);

Proxy requests can be distinguished from normal requests by exchange.request().target() being an absolute URL (currently this is done simply by making proxy the default fallthrough route).
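As a rough illustration of the absolute-URL check described above, here is a minimal sketch. The class ProxyTargetCheck and the method isProxyRequest are hypothetical names made up for this example and are not part of jwarc; the sketch only assumes that the request target is available as a plain string. It also assumes that only http and https targets should count as proxy requests, which is stricter than a bare "is absolute" test.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper for illustration only; not part of jwarc's WarcServer.
class ProxyTargetCheck {

    // Returns true when the request target is in absolute form
    // (e.g. "http://www.example.org/"), which is what a client sends
    // when it talks to the server as an HTTP proxy.
    static boolean isProxyRequest(String target) {
        try {
            URI uri = new URI(target);
            String scheme = uri.getScheme();
            // Assumption: only http/https targets are treated as proxy requests.
            boolean httpScheme = "http".equalsIgnoreCase(scheme)
                    || "https".equalsIgnoreCase(scheme);
            // Requiring a host rules out opaque forms such as "host:port".
            return httpScheme && uri.getHost() != null;
        } catch (URISyntaxException e) {
            // An unparseable target is not treated as a proxy request.
            return false;
        }
    }

    public static void main(String[] args) {
        // Absolute-form target, as sent by curl --proxy
        System.out.println(isProxyRequest("http://www.example.org/"));
        // Origin-form target, as sent for a normal replay URL
        System.out.println(isProxyRequest("/replay/20230101000000/http://www.example.org/"));
    }
}
```

With a check along these lines, the proxy route could be selected explicitly instead of relying on it being the fallthrough route.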

Note that jwarc's WarcServer hasn't been well tested and lacks important features like a proper index and date-selection UI in proxy mode. It's more of a proof of concept / demo. I would currently recommend pywb's proxy mode instead for most users.

ato commented

Demonstration that script injection doesn't happen when used in proxy mode:

$ jwarc fetch http://www.example.org/ > /tmp/example.warc
$ jwarc serve /tmp/example.warc &
Listening on port 8080
$ curl --proxy http://localhost:8080 http://www.example.org/
<!doctype html>
<html>
<head>
    <title>Example Domain</title>

But it does happen when used in normal replay mode:

$ curl http://localhost:8080/replay/20230101000000/http://www.example.org/
<!doctype html><script src='/__jwarc__/inject.js'></script>

Thanks @ato,

I fixed it. Perfect, thanks for the tips.