Pass Context into Request Middleware
JakeOcean opened this issue · 3 comments
Is there any way to pass context into request middleware or an ItemProcessor?
Within an ItemProcessor, if you need a bit of context from the spider before you can save the item to the database, there seems to be no way to access any metadata or the Request/Response objects.
Is there any reason you can't put that metadata on the item itself before yielding it?
Closing due to inactivity. Feel free to reopen if you have more information.
You can forward context parameters that were passed to the spider from outside on to the ItemProcessor by copying them onto the item, as in the parse() method below.
https://roach-php.dev/docs/spiders/#passing-additional-context-to-spiders
public function parse(Response $response): Generator
{
    $userAgent = $this->context['userAgent'];

    yield $this->item([
        'userAgent' => $userAgent, // This will be passed to ItemProcessor
        'url' => $response->getUri(),
    ]);
}
https://roach-php.dev/docs/item-pipeline/#making-processors-configurable
I don't know if it's a good approach, but it can be done.
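For the receiving side, here is a minimal sketch of a processor that reads the value back off the item. The class name SaveWithUserAgentProcessor and the decision to drop items that lack the field are assumptions for illustration, not something from this thread; the interface, trait, and get()/drop() calls follow the item pipeline docs linked above, so check them against the Roach version you have installed.

```php
<?php

use RoachPHP\ItemPipeline\ItemInterface;
use RoachPHP\ItemPipeline\Processors\ItemProcessorInterface;
use RoachPHP\Support\Configurable;

// Hypothetical processor name; not part of Roach itself.
final class SaveWithUserAgentProcessor implements ItemProcessorInterface
{
    use Configurable;

    public function processItem(ItemInterface $item): ItemInterface
    {
        // 'userAgent' is the field the spider copied from $this->context in parse().
        $userAgent = $item->get('userAgent');

        if ($userAgent === null) {
            // Assumption: items without the context value are not worth saving.
            return $item->drop('Missing userAgent context');
        }

        // Persist $item->get('url') together with $userAgent here.

        return $item;
    }
}
```

The processor never sees the spider or the Request/Response objects; it only sees whatever fields parse() placed on the item, which is why the context value has to travel as an item field.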
It is unclear whether the built-in middleware can directly obtain the header information of external requests.