Large Response Support
trieloff opened this issue · 6 comments
With the wrapper and universal gateway in place, we can now support large responses in the following way:
- The wrapper detects that the response size exceeds a certain limit
- The wrapper stores the response in an S3 bucket or equivalent storage – ideally in a manner that does not require additional dependencies that need to be packaged or credentials that need to be configured
- The wrapper returns a 307 status with `Location` pointing to the stored response body
- The gateway intercepts the 307 status, `restart`s and delivers the body from the `Location`
- The wrapper or some asynchronous job cleans up the response body storage
The response cleanup could also be done by the wrapper in the next request.
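
For illustration, here is a rough sketch of what the wrapper side could look like. It assumes the AWS SDK (`@aws-sdk/client-s3`) and made-up names like `maybeOffloadLargeResponse`, `BUCKET`, and `RESPONSE_LIMIT`; the real wrapper would ideally avoid the extra dependency and credential setup mentioned above.

```typescript
// Hypothetical sketch only: bucket name, size limit, and the response shape
// are assumptions, not the actual wrapper implementation.
import { S3Client, PutObjectCommand } from "@aws-sdk/client-s3";
import { randomUUID } from "crypto";

const RESPONSE_LIMIT = 1024 * 1024; // assumed serverless payload limit (1 MB)
const BUCKET = "large-response-overflow"; // hypothetical bucket name

const s3 = new S3Client({});

interface WrappedResponse {
  statusCode: number;
  headers: Record<string, string>;
  body: string;
}

// Called by the wrapper after the wrapped function has produced its response.
async function maybeOffloadLargeResponse(res: WrappedResponse): Promise<WrappedResponse> {
  if (Buffer.byteLength(res.body) <= RESPONSE_LIMIT) {
    return res; // small enough, deliver normally
  }

  // Store the oversized body under a random key.
  const key = `responses/${randomUUID()}`;
  await s3.send(new PutObjectCommand({
    Bucket: BUCKET,
    Key: key,
    Body: res.body,
    ContentType: res.headers["content-type"] || "application/octet-stream",
  }));

  // Return a 307 pointing at the stored body; the gateway is expected to
  // intercept this status, restart, and deliver the body from the Location.
  return {
    statusCode: 307,
    headers: { location: `https://${BUCKET}.s3.amazonaws.com/${key}` },
    body: "",
  };
}
```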
i would rather keep it simple and just deliver things normally. it is easy to split sitemaps up to an arbitrary limit for the edge cases where that's needed. we definitely shouldn't make things more complicated generally, for an edge case of an edge case...
ha! ...i just realized it is an edge case (large text/xml sitemap) of an edge case (sitemaps in general) of an edge case (running on serverless infrastructure with an unreasonably small response payload limit)
@tripodsan that could be a way of serving large sitemaps: split them up automatically and allow serving fragments of sitemaps from helix-content-proxy.
how can sitemaps be split up?
https://developers.google.com/search/docs/advanced/sitemaps/large-sitemaps
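
As a rough sketch of what that splitting could look like (names and the per-fragment URL pattern are made up for this example; the 50,000-URL limit is from the Google doc above), the URL set could be chunked into sitemap fragments plus a sitemap index:

```typescript
// Illustrative sketch only: splits a flat list of URLs into sitemap fragments
// and a sitemap index referencing them.
const MAX_URLS_PER_SITEMAP = 50000; // Google's per-sitemap URL limit

function buildSitemapFragments(urls: string[], baseUrl: string): { index: string; fragments: string[] } {
  const fragments: string[] = [];
  for (let i = 0; i < urls.length; i += MAX_URLS_PER_SITEMAP) {
    const chunk = urls.slice(i, i + MAX_URLS_PER_SITEMAP);
    fragments.push(
      `<?xml version="1.0" encoding="UTF-8"?>\n` +
      `<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n` +
      chunk.map((u) => `  <url><loc>${u}</loc></url>`).join("\n") +
      `\n</urlset>`
    );
  }

  // The index points at each fragment, e.g. /sitemap-0.xml, /sitemap-1.xml, ...
  const index =
    `<?xml version="1.0" encoding="UTF-8"?>\n` +
    `<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n` +
    fragments.map((_, i) => `  <sitemap><loc>${baseUrl}/sitemap-${i}.xml</loc></sitemap>`).join("\n") +
    `\n</sitemapindex>`;

  return { index, fragments };
}
```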
ok, we can also serve it as *.gz:

`<loc>http://www.example.com/sitemap1.xml.gz</loc>`

Could we serve the original sitemap as .gz?
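
If gzipping is enough to stay under the payload limit, a minimal sketch could look like the following (using Node's built-in zlib; the handler/response shape is an assumption for this example, not the actual helix-content-proxy code):

```typescript
// Sketch: serve the sitemap gzipped so the body stays under the payload limit.
import { gzipSync } from "zlib";

function gzipSitemapResponse(xml: string) {
  const compressed = gzipSync(Buffer.from(xml, "utf-8"));
  return {
    statusCode: 200,
    headers: {
      "content-type": "application/xml",
      "content-encoding": "gzip",
    },
    // many serverless runtimes require binary bodies to be base64 encoded
    body: compressed.toString("base64"),
    isBase64Encoded: true,
  };
}
```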