TeamHG-Memex/scrapy-rotating-proxies

Banned responses get to the engine in the end

3hhh opened this issue · 0 comments

3hhh commented

If a response is identified as a ban by response_is_ban(self, request, response), it currently still reaches the spider's parse() method once your middleware's final retry attempt has failed, because after more than ROTATING_PROXY_PAGE_RETRY_TIMES banned attempts you neither raise an exception nor otherwise stop the response.
This is inconvenient, because the user has to call response_is_ban(self, request, response) again in their parse() implementation.
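
To illustrate what "stopping the response" could look like, here is a minimal sketch and not the project's code. It runs without Scrapy: `IgnoreRequest` is a local stand-in for `scrapy.exceptions.IgnoreRequest`, `FakeRequest` and `FakeResponse` are hypothetical placeholders for Scrapy's request and response objects, and the ban rule in `response_is_ban` is made up for the example. In a real project this would presumably have to sit closer to the engine than the rotating-proxies middlewares, so that it only sees responses they have given up on; that ordering is an assumption I have not verified.

```python
from dataclasses import dataclass, field


class IgnoreRequest(Exception):
    """Local stand-in for scrapy.exceptions.IgnoreRequest (assumption: used the same way)."""


@dataclass
class FakeResponse:
    # Hypothetical placeholder for a Scrapy response.
    status: int
    body: bytes = b""


@dataclass
class FakeRequest:
    # Hypothetical placeholder for a Scrapy request.
    url: str
    meta: dict = field(default_factory=dict)


def response_is_ban(request, response):
    # Made-up ban rule for the example: non-200 status or empty body.
    return response.status != 200 or not response.body


class DropBannedResponsesMiddleware:
    """Hypothetical middleware: raise instead of passing a banned response on."""

    def process_response(self, request, response, spider=None):
        if response_is_ban(request, response):
            # The spider's parse() never sees this response.
            raise IgnoreRequest(f"banned response for {request.url}")
        return response
```

With something like this in place, parse() would only receive responses that passed the ban check, so it would not need to repeat the check itself.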

Apart from that, I also noticed that ROTATING_PROXY_PAGE_RETRY_TIMES = 1 results in 2 retries rather than 1; the number of retries is always one more than ROTATING_PROXY_PAGE_RETRY_TIMES.
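
To make the off-by-one concrete, here is a small model of a retry counter for a page that is banned on every attempt. It is only an illustration of how such a discrepancy can arise from an inclusive comparison; it is not taken from the middleware's source, and `count_retries` and its `inclusive` flag are hypothetical names.

```python
def count_retries(page_retry_times, inclusive):
    """Count retries for a page that is banned on every attempt.

    inclusive=True models a check like `retries_done <= limit`,
    inclusive=False models `retries_done < limit`.
    """
    retries_done = 0
    while True:
        if inclusive:
            allowed = retries_done <= page_retry_times
        else:
            allowed = retries_done < page_retry_times
        if not allowed:
            break  # give up; no further retry is scheduled
        retries_done += 1  # schedule one more retry, which is banned again
    return retries_done


# With a limit of 1, the strict check gives the retry count I expected,
# the inclusive check gives the one I observed.
print(count_retries(1, inclusive=False))
print(count_retries(1, inclusive=True))
```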