Whitenoise does not limit reads beyond the end of a range request
rtibbles opened this issue · 2 comments
First off - thank you for Whitenoise, it is an incredible piece of software!
I have noticed that when sending a range request with a complete range to whitenoise, e.g. Range: bytes=0-1, the file handle that whitenoise returns does not prevent reading beyond the end of the specified range.
This seems to produce unpredictable results with different WSGI servers because of an ambiguity in the WSGI specification about how the file wrapper extension should handle Content-Length.
When running whitenoise with cheroot, making a terminated range request such as Range: bytes=0-1 gives a 500 error because the file handle returned to the file wrapper produces data beyond the end of the range requested.
When running with Bjoern the request does complete, but clients seem to disconnect before the request has completed, giving 104 errors.
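The underlying behaviour can be reproduced without any server. The following is a minimal sketch, assuming an in-memory file and the standard library's wsgiref.util.FileWrapper as a stand-in for a server's file wrapper; the helper name read_through_wrapper is hypothetical and not part of whitenoise. It shows that a handle which has only been seeked to the start of the range keeps yielding data up to the end of the file.

```python
import io
from wsgiref.util import FileWrapper


def read_through_wrapper(content, start, blksize=4):
    """Seek to `start`, then drain the handle the way a file wrapper would."""
    handle = io.BytesIO(content)
    handle.seek(start)
    # FileWrapper only calls handle.read(blksize) until it gets b"";
    # it knows nothing about where the requested range ends.
    return b"".join(FileWrapper(handle, blksize=blksize))


body = read_through_wrapper(b"0123456789", start=0)
print(len(body))  # 10, not the 2 bytes a "bytes=0-1" response would advertise
```

A server that trusts the wrapper rather than Content-Length will therefore see more body data than the headers promise, which fits the symptoms described above.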
I am working around this in Kolibri by patching Whitenoise to return a file-like object when a range request is made. The file-like object is something like this:
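from io import BufferedIOBase


class SlicedFile(BufferedIOBase):
    def __init__(self, fileobj, start, end):
        fileobj.seek(start)
        self.fileobj = fileobj
        # "end" is inclusive, as in the Range header
        self.remaining = end - start + 1

    def read(self, size=-1):
        if self.remaining <= 0:
            return b""
        # A negative size means "read everything left in the range"
        if size < 0:
            size = self.remaining
        else:
            size = min(size, self.remaining)
        data = self.fileobj.read(size)
        # Count the bytes actually returned, not the bytes requested
        self.remaining -= len(data)
        return data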
with the following change to get_range_response:
def get_range_response(self, range_header, base_headers, file_handle):
    headers = []
    for item in base_headers:
        if item[0] == "Content-Length":
            size = int(item[1])
        else:
            headers.append(item)
    start, end = self.get_byte_range(range_header, size)
    if start >= end:
        return self.get_range_not_satisfiable_response(file_handle, size)
    if file_handle is not None:
        file_handle = SlicedFile(file_handle, start, end)
    headers.append(("Content-Range", "bytes {}-{}/{}".format(start, end, size)))
    headers.append(("Content-Length", str(end - start + 1)))
    return Response(HTTPStatus.PARTIAL_CONTENT, headers, file_handle)
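As a possible starting point for a test, here is a self-contained sketch of the same idea run against an in-memory file. BoundedReader is a hypothetical name for a variant of SlicedFile, and counting the bytes actually returned (rather than the bytes requested) is a choice made in this sketch, not something taken from whitenoise itself.

```python
import io


class BoundedReader(io.BufferedIOBase):
    """File-like object that exposes only bytes start..end (inclusive)."""

    def __init__(self, fileobj, start, end):
        fileobj.seek(start)
        self.fileobj = fileobj
        self.remaining = end - start + 1

    def read(self, size=-1):
        if self.remaining <= 0:
            return b""
        # A missing or negative size means "everything left in the range"
        if size is None or size < 0:
            size = self.remaining
        else:
            size = min(size, self.remaining)
        data = self.fileobj.read(size)
        self.remaining -= len(data)
        return data


# "Range: bytes=0-1" read in one go
whole = BoundedReader(io.BytesIO(b"0123456789"), 0, 1).read()

# "Range: bytes=2-6" read in blocks of 4, as a file wrapper might do
reader = BoundedReader(io.BytesIO(b"0123456789"), 2, 6)
chunks = [reader.read(4), reader.read(4), reader.read(4)]
```

Here whole is b"01" and chunks is [b"2345", b"6", b""], so the total number of bytes read matches the Content-Length value of end - start + 1 in both cases.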
I am happy to implement this and make a pull request to whitenoise if this would be useful/acceptable, although I am not currently familiar enough with whitenoise's test suite to know how to test this.
This seems like a very reasonable fix! Thank you for the code snippets.
I'm just starting to help with the maintenance of Whitenoise. If you want to start a PR and try to figure out what you can for testing, I can help finish it off.
Hi @adamchainz - great, and we've you to thank for suggesting we switch Kolibri to use whitenoise in the first place! I'll try to submit a PR before the end of the week.