Add option for a delay between requests
XiangRongLin opened this issue · 2 comments
Feature description
In the scenario where multiple URLs are passed in, I want to be able to specify a delay between the requests to the website.
My use case would be downloading all chapters from a table of contents, where I imagine I would quickly get blocked if hundreds of requests are sent as fast as possible.
Existing workarounds
Is there any way to obtain the desired effect with the current functionality?
Not that I know of, because I want the output to be combined into a single EPUB file.
Hi @XiangRongLin, thank you for the report. In general, I've avoided implementing options for fetching pages, since that opens up a whole new dimension of configuration (do we support delays / parallelism? proxies? authentication headers? etc.). Instead, you can use a combination of the `-` operand and the `--url` option to offload the responsibility to a separate program (e.g. `curl`), as below:

```sh
curl https://example.com | percollate pdf - --url=https://example.com
```
For bundling multiple pages into a single EPUB, the workaround is admittedly a bit convoluted:
- fetch each page using `curl` and feed it to `percollate html` with the `-` operand and the `--url` option, using your desired parallelism and delay between requests;
- feed all the local HTML pages to `percollate epub` (see the sketch after this list).
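A minimal shell sketch of that workaround, assuming `percollate html` accepts `-o` to name the output file and that a 5-second pause between fetches is enough to avoid being blocked (the chapter URLs and the delay are placeholders):

```sh
#!/bin/sh
# Hypothetical list of chapter URLs; replace with the real table of contents.
urls="https://example.com/chapter-1 https://example.com/chapter-2"

i=0
for url in $urls; do
  i=$((i + 1))
  # Fetch the raw page with curl, then let percollate clean it up,
  # passing --url so relative links and images resolve correctly.
  curl -s "$url" | percollate html - --url="$url" -o "chapter-$i.html"
  sleep 5  # delay between requests
done

# Bundle the locally saved, cleaned-up pages into a single EPUB.
percollate epub chapter-*.html -o book.epub
```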
It might make sense to introduce an option to control parallelism and delay, such as:
```sh
percollate epub --wait=N url1 url2 ...
```

When `--wait` is supplied, percollate could switch from fetching pages in parallel to fetching them sequentially, with a delay of N seconds between requests.
The `--wait` option has been published in percollate@2.2.0.
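For reference, a sketch of how the published option could cover the original use case; the chapter URLs and the 5-second delay are placeholders, and the `-o` output flag is assumed to be available alongside `--wait`:

```sh
# Fetch the chapters sequentially, pausing 5 seconds between requests,
# and bundle them into a single EPUB.
percollate epub --wait=5 -o book.epub \
  https://example.com/chapter-1 \
  https://example.com/chapter-2 \
  https://example.com/chapter-3
```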