Modify Response object
Thank you for creating such an amazing package! My goal is to retrieve only the intercepted json_data, without the full HTML page content. Is there a way to set the intercepted json_data as the body of the Response object that is received in the parse callback?
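import scrapy
from playwright.async_api import Route
from scrapy.http import Response
from scrapy_playwright.page import PageMethod


class StoresSpider(scrapy.Spider):  # class and spider names are placeholders
    name = "stores"

    def start_requests(self):
        url = "https://littlecaesars.com/en-us/order/pickup/stores/search/75215/"
        yield scrapy.Request(url, callback=self.parse, meta={
            "playwright": True,
            "playwright_page_methods": [
                # Intercept the API call that returns the store list.
                PageMethod("route", "**/api/GetClosestStores", self.capture_request),
                PageMethod("wait_for_selector", "//button[contains(text(), 'Start your order')]"),
            ]
        })

    async def capture_request(self, route: Route):
        # Perform the real request, read its JSON, then pass it on to the page.
        response = await route.fetch()
        json_data = await response.json()
        await route.fulfill(response=response, json=json_data)

    def parse(self, response: Response):
        # json_data is not available here; response.body is the rendered HTML.
        pass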
(Due to reCAPTCHA protection, using Playwright is essential here.)
I'd suggest not setting a custom route; it's usually not a good idea given all the processing done in the handler. I think you can do something similar to this, i.e. intercept the "response" event, extract what you need, store it somewhere temporarily, then retrieve it in the main response callback and return it from there. You might need something like an asyncio.Event if you run into synchronization issues, e.g. if the callback runs before the response event handler (though I'm just thinking out loud, not sure that's even an actual possibility).
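To make the store-then-retrieve idea concrete, here is a minimal sketch of the synchronization part only, written with plain asyncio so it runs without a browser. JsonCapture and FakeResponse are hypothetical names invented for this illustration, and the example.com URLs are placeholders; FakeResponse only mimics the url attribute and json() coroutine of a Playwright response. In a real spider, on_response would presumably be attached to the page's "response" event (for instance through scrapy-playwright's playwright_page_event_handlers request meta key) and wait() would be awaited from an async parse callback. That wiring is an assumption and has not been tested against the site in question.

```python
import asyncio
import json


class FakeResponse:
    """Hypothetical stand-in for a Playwright response (only .url and .json())."""

    def __init__(self, url, payload):
        self.url = url
        self._payload = payload

    async def json(self):
        return json.loads(self._payload)


class JsonCapture:
    """Stores the first JSON body whose URL contains `marker`."""

    def __init__(self, marker):
        self.marker = marker
        self.data = None
        self._ready = asyncio.Event()

    async def on_response(self, response):
        # Meant to be used as a "response" event handler.
        if self.marker in response.url and not self._ready.is_set():
            self.data = await response.json()
            self._ready.set()

    async def wait(self, timeout=10.0):
        # Meant to be awaited in the main callback; returns once the handler
        # has stored the data, or raises asyncio.TimeoutError.
        await asyncio.wait_for(self._ready.wait(), timeout)
        return self.data


async def demo():
    capture = JsonCapture("/api/GetClosestStores")

    async def emit_later():
        # Simulate responses arriving after the callback started waiting.
        await asyncio.sleep(0.01)
        await capture.on_response(
            FakeResponse("https://example.com/index.html", "{}")
        )
        await capture.on_response(
            FakeResponse("https://example.com/api/GetClosestStores", '{"stores": [1, 2]}')
        )

    task = asyncio.create_task(emit_later())
    data = await capture.wait(timeout=1.0)
    await task
    return data


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

The event covers both orderings: if the handler fires first, wait() returns immediately; if the callback runs first, it suspends until the handler sets the event. The timeout keeps the callback from hanging forever when the API call never happens.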