TeamHG-Memex/scrapy-rotating-proxies

This middleware breaks default throttling

3hhh opened this issue · 0 comments

You set a per-proxy download_slot at [1].

Essentially that means:
Throttling now works per proxy (cf. [2]; the same goes for `DOWNLOAD_DELAY` et al.) and no longer per destination host, which would be the default.

Since most users will have far more than 100 proxies, the target host gets hammered with far more than 100 requests at once, because each proxy slot is throttled independently.
So users can be nice to their proxy provider, but not to the destination host.
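
To illustrate the effect, here is a minimal standalone sketch (plain Python, no Scrapy import). It assumes the downloader picks a slot by using `request.meta['download_slot']` when present and the target hostname otherwise, and that the middleware sets that key to the proxy's hostname. The function `slot_key`, the proxy addresses and the target URLs are hypothetical and only serve the illustration.

```python
from urllib.parse import urlsplit


def slot_key(url, meta):
    """Hypothetical stand-in for the downloader's slot choice:
    an explicit 'download_slot' wins, otherwise the target hostname is used."""
    if "download_slot" in meta:
        return meta["download_slot"]
    return urlsplit(url).hostname or ""


# Made-up data: 150 proxies, 150 requests to a single target host.
proxies = [f"http://proxy{i}.example:8080" for i in range(150)]
urls = [f"https://target.example/page/{i}" for i in range(150)]

# Default behaviour: every request to the same host shares one slot,
# so delay and concurrency limits apply to the host as a whole.
default_slots = {slot_key(url, {}) for url in urls}

# Per-proxy slots (assumed middleware behaviour): one slot per proxy,
# each with its own delay and concurrency budget.
proxy_slots = {
    slot_key(url, {"download_slot": urlsplit(proxy).hostname})
    for url, proxy in zip(urls, proxies)
}

print(len(default_slots), "slot(s) by default")
print(len(proxy_slots), "slot(s) with per-proxy download_slot")
```

With one slot per proxy, any per-slot delay is enforced 150 times in parallel against the same destination host in this example.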

[1] https://github.com/TeamHG-Memex/scrapy-rotating-proxies/blob/master/rotating_proxies/middlewares.py#L146
[2] https://github.com/scrapy/scrapy/blob/master/scrapy/extensions/throttle.py