`proxy.py` is a simple web proxy that passes requests and data between a web client and a web server. The proxy also supports:
- Telemetry
- Image Substitution
- Attack mode
To run the proxy, use `python3 ./proxy.py <port> <image-flag> <attack-flag>`.
- To print the help page, use `python3 ./proxy.py -h`.
- Python 3.10.6 is preferred, but the proxy has also been tested with Python 3.8.10 on the xcne server.
- To enable logging, comment out line 333 of the code.
Clarification:
- This proxy only supports `Content-Length`-based reading of the request body. If a request uses chunked encoding and has no `Content-Length` header, its body will not be read.
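As a rough illustration of this limitation, here is a minimal sketch of `Content-Length`-only body reading. The function name `read_request_body`, the `stream` argument, and the lower-cased `headers` dict are all assumptions made for this example and are not taken from `proxy.py`.

```python
import io


def read_request_body(stream, headers):
    """Read a request body using only the Content-Length header.

    `stream` is a binary file-like object positioned just after the
    header block; `headers` maps lower-cased header names to values.
    Both are hypothetical and only serve this sketch.
    """
    length = headers.get("content-length")
    if length is None:
        # No Content-Length: the body (e.g. a chunked one) is not read.
        return b""
    remaining = int(length)
    chunks = []
    while remaining > 0:
        data = stream.read(remaining)
        if not data:  # the peer closed the connection early
            break
        chunks.append(data)
        remaining -= len(data)
    return b"".join(chunks)


# A chunked request without Content-Length yields an empty body.
chunked = io.BytesIO(b"5\r\nhello\r\n0\r\n\r\n")
print(read_request_body(chunked, {"transfer-encoding": "chunked"}))
```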
These are the assumptions that I made during this assignment:
- Telemetry is keyed on the `(referer, user-agent)` pair. Therefore, if two browser sessions open the same tabs, their telemetry will be combined. For the initial webpage, which has no `referer`, the proxy uses the initial resource URL instead.
- To determine that all the `GET` requests from one browser session are done, for each request coming from the same `(host, port)` source, the proxy waits 5 s (a purely heuristic value) after the last request. If no new request arrives, the telemetry is output. Therefore, if some requests lag badly, they may be reported under a separate telemetry entry.
- For Image Sub, I count the new image (i.e. `/change.jpg`) towards telemetry instead of the original image. Similarly, for Attack mode, I count the artificial response that I return instead of the original resource sizes.
- Only resources that are successfully fetched (i.e. `HTTP` response code `200`) are counted towards telemetry.
- To determine whether a request is for an image, I try to match it against common image file extensions such as `.jpeg, .jpg, .png, .svg, .gif`, etc. Moreover, if a requested file was not substituted with `./change.jpg` but its response has a `Content-Type` of an image, I treat it as an image too. This provides an extra layer of protection to ensure all images are substituted. However, there are still loopholes if `Content-Type` is wrong or missing and the extension of the requested file is not one of the common image extensions.
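The two-step image check from the last assumption could look roughly like the sketch below. The helper names and the exact extension tuple are assumptions for illustration; the full list used in `proxy.py` is not shown in this README.

```python
from urllib.parse import urlsplit

# Illustrative list only; the actual set in proxy.py may be longer.
IMAGE_EXTENSIONS = (".jpeg", ".jpg", ".png", ".svg", ".gif")


def looks_like_image_request(url):
    """First check: match the URL path against common image extensions."""
    path = urlsplit(url).path.lower()  # drops the query string and fragment
    return path.endswith(IMAGE_EXTENSIONS)


def is_image_response(headers):
    """Second check: fall back to the response Content-Type.

    `headers` is assumed to map lower-cased header names to values.
    """
    content_type = headers.get("content-type", "")
    return content_type.strip().lower().startswith("image/")


# An extension-less URL slips past the first check, so only the
# Content-Type fallback can catch it.
print(looks_like_image_request("http://example.com/render?id=7"))
print(is_image_response({"content-type": "image/webp"}))
```

The loophole mentioned above corresponds to the case where both functions return `False` for a resource that is in fact an image.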
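The telemetry assumptions above (the `(referer, user-agent)` key, the fallback to the resource URL, the `200`-only rule, and the 5 s idle window) can be sketched as follows. `TelemetryTable` and its methods are hypothetical names, and for simplicity this sketch tracks the idle timer per telemetry key rather than per `(host, port)` source, so it is not a description of how `proxy.py` actually does it.

```python
IDLE_SECONDS = 5.0  # heuristic idle window described in the assumptions


class TelemetryTable:
    """Hypothetical aggregator; names are illustrative, not from proxy.py."""

    def __init__(self):
        # key -> [total_bytes, resource_count, last_seen_timestamp]
        self.entries = {}

    def record(self, referer, user_agent, url, status, size, now):
        if status != 200:
            return  # only successfully fetched resources are counted
        # A page load without a referer is keyed by its own URL.
        key = (referer or url, user_agent)
        entry = self.entries.setdefault(key, [0, 0, now])
        entry[0] += size
        entry[1] += 1
        entry[2] = now

    def flush_idle(self, now):
        """Return and remove entries that saw no request for IDLE_SECONDS."""
        done = {key: (value[0], value[1])
                for key, value in self.entries.items()
                if now - value[2] >= IDLE_SECONDS}
        for key in done:
            del self.entries[key]
        return done


table = TelemetryTable()
table.record(None, "UA", "http://a/", 200, 100, now=0.0)
table.record("http://a/", "UA", "http://a/x.css", 200, 50, now=1.0)
print(table.flush_idle(now=6.0))
```

Passing `now` explicitly keeps the sketch deterministic; a real proxy would more likely read a clock such as `time.monotonic()`.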