Scrapy template: install_reactor can be replaced by configuration
Closed this issue ยท 2 comments
I just learned that
from scrapy.utils.reactor import install_reactor
install_reactor("twisted.internet.asyncioreactor.AsyncioSelectorReactor")
can be replaced by
TWISTED_REACTOR = "twisted.internet.asyncioreactor.AsyncioSelectorReactor"
(at least per scrapy-playwright's docs). Since the template already modifies the settings through apply_apify_settings()
, change of the reactor could be done there, in a declarative way, instead of imperative import and a function call.
This is just a suggestion for improvement, I have no idea what are the design decisions behind the current design and it's very well possible that the two approaches are not comparable or have the same outcome.
Hi, thanks for the suggestion!
I vaguely remember that just setting the TWISTED_REACTOR
setting didn't work correctly, because we have to configure AsyncioSelectorReactor
and nest_asyncio
in the correct order (install Reactor first, then apply nest_asyncio
), otherwise Scrapy Actors don't run on Windows correctly.
Unfortunately, the whole Scrapy <-> asyncio
integration is a mess, which is understandable given that Scrapy and Twisted started out way before asyncio
was a thing, and then the asyncio
integration was just shoehorned in. There are several hacks to make the Apify SDK work with Scrapy, with separate event loops for request queues and so on, it's pretty ugly ๐ฌ
I'll close this issue, as the code has to be the way it is, but I'll add a comment to the template explaining the situation better.
Thanks anyway, we're happy for these kinds of suggestions! ๐