spider: impossible to setup grab transport
ingvarbiz opened this issue · 2 comments
ingvarbiz commented
I want to set CURLOPT_RESOLVE so that a host resolves to a specific IP address, so in create_grab_instance() I wrote:
...
g.setup_transport('pycurl')
g.transport.curl.setopt(pycurl.RESOLVE, ['api.somesite.com:443:{}'.format(ip)])
return g
When I call spider.run(), I get the following error:
ERROR:grab.spider.base_service:Spider Service Fatal Error
Traceback (most recent call last):
  File "/home/jetscraper/jet/lib/python3.5/site-packages/grab/spider/base_service.py", line 32, in wrapper
    callback(*args, **kwargs)
  File "/home/jetscraper/jet/lib/python3.5/site-packages/grab/spider/network_service/multicurl.py", line 159, in spawner_callback
    grab = self.spider.setup_grab_for_task(task)
  File "/home/jetscraper/jet/lib/python3.5/site-packages/grab/spider/base.py", line 553, in setup_grab_for_task
    grab.setup_transport(self.grab_transport_name)
  File "/home/jetscraper/jet/lib/python3.5/site-packages/grab/base.py", line 253, in setup_transport
    'Transport is already set up. Use'
grab.error.GrabMisuseError: Transport is already set up. Use setup_transport(..., reset=True) to explicitly setup new transport
Traceback (most recent call last):
  File "jet_get_dxm.py", line 119, in <module>
    scraper.run()
  File "/home/jetscraper/jet/lib/python3.5/site-packages/grab/spider/base.py", line 689, in run
    raise exc_info[1]
The problem is in grab/spider/base.py, line 553, in setup_grab_for_task:
grab.setup_transport(self.grab_transport_name)
So I had to comment out this line. Is it possible to configure CURLOPT_RESOLVE somewhere else?
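For reference, each CURLOPT_RESOLVE entry is a string of the form HOST:PORT:ADDRESS, which is what the format() call in the snippet above produces. A minimal sketch of building and validating such an entry follows. The helper name make_resolve_entry and the address 203.0.113.10 are placeholders for illustration only and are not part of Grab or pycurl.

```python
import ipaddress


def make_resolve_entry(host, port, ip):
    """Build one 'HOST:PORT:ADDRESS' string for CURLOPT_RESOLVE.

    Hypothetical helper, not part of Grab or pycurl.
    """
    port = int(port)
    if not 0 < port < 65536:
        raise ValueError('port out of range: {}'.format(port))
    # ipaddress.ip_address raises ValueError for a malformed address,
    # so a typo is caught here rather than at request time.
    address = ipaddress.ip_address(ip)
    return '{}:{}:{}'.format(host, port, str(address))


# 203.0.113.10 is a documentation-range placeholder address.
entry = make_resolve_entry('api.somesite.com', 443, '203.0.113.10')
print(entry)
```

The resulting string would then go into the list passed to setopt(pycurl.RESOLVE, [...]), as in the snippet above.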
lorien commented
Use update_grab_instance
ingvarbiz commented
I tried that before submitting the issue, and it doesn't work either: update_grab_instance is called before setup_transport.
def setup_grab_for_task(self, task):
    grab = self.create_grab_instance()
    if task.grab_config:
        grab.load_config(task.grab_config)
    else:
        grab.setup(url=task.url)
    # Generate new common headers
    grab.config['common_headers'] = grab.common_headers()
    self.update_grab_instance(grab)
    grab.setup_transport(self.grab_transport_name)
    return grab
Same error here:
Spider Service Fatal Error
Traceback (most recent call last):
  File "/home/jetscraper/tests/lib/python3.5/site-packages/grab/spider/base_service.py", line 32, in wrapper
    callback(*args, **kwargs)
  File "/home/jetscraper/tests/lib/python3.5/site-packages/grab/spider/network_service/multicurl.py", line 159, in spawner_callback
    grab = self.spider.setup_grab_for_task(task)
  File "/home/jetscraper/tests/lib/python3.5/site-packages/grab/spider/base.py", line 553, in setup_grab_for_task
    grab.setup_transport(self.grab_transport_name)
  File "/home/jetscraper/tests/lib/python3.5/site-packages/grab/base.py", line 253, in setup_transport
    'Transport is already set up. Use'
grab.error.GrabMisuseError: Transport is already set up. Use setup_transport(..., reset=True) to explicitly setup new transport
Traceback (most recent call last):
  File "t2.py", line 93, in <module>
    s.run()
  File "/home/jetscraper/tests/lib/python3.5/site-packages/grab/spider/base.py", line 689, in run
    raise exc_info[1]
  File "/home/jetscraper/tests/lib/python3.5/site-packages/grab/spider/base_service.py", line 32, in wrapper
    callback(*args, **kwargs)
  File "/home/jetscraper/tests/lib/python3.5/site-packages/grab/spider/network_service/multicurl.py", line 159, in spawner_callback
    grab = self.spider.setup_grab_for_task(task)
  File "/home/jetscraper/tests/lib/python3.5/site-packages/grab/spider/base.py", line 553, in setup_grab_for_task
    grab.setup_transport(self.grab_transport_name)
  File "/home/jetscraper/tests/lib/python3.5/site-packages/grab/base.py", line 253, in setup_transport
    'Transport is already set up. Use'
grab.error.GrabMisuseError: Transport is already set up. Use setup_transport(..., reset=True) to explicitly setup new transport
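The call order behind both tracebacks can be reproduced without Grab installed. The sketch below uses stand-in classes (FakeGrab, FakeSpider, and a local GrabMisuseError), all hypothetical and modelling only the "already set up" guard and the order of calls in setup_grab_for_task quoted above. It also shows one possible workaround under that model: overriding setup_grab_for_task in the subclass and applying transport-level options after the parent has set the transport up. Whether options set this way on the real pycurl handle survive until the multicurl service sends the request is an assumption that has not been verified here.

```python
class GrabMisuseError(Exception):
    """Local stand-in for grab.error.GrabMisuseError."""


class FakeGrab:
    """Stand-in modelling only the transport guard, not the real Grab."""

    def __init__(self):
        self.transport = None

    def setup_transport(self, name, reset=False):
        # Mirrors the guard reported in the traceback: a second call
        # without reset=True is rejected.
        if self.transport is not None and not reset:
            raise GrabMisuseError('Transport is already set up.')
        # The real transport wraps a pycurl handle; a dict stands in here.
        self.transport = {'name': name, 'options': {}}


class FakeSpider:
    """Stand-in following the call order of setup_grab_for_task."""

    grab_transport_name = 'pycurl'

    def create_grab_instance(self):
        return FakeGrab()

    def update_grab_instance(self, grab):
        pass

    def setup_grab_for_task(self, task):
        grab = self.create_grab_instance()
        self.update_grab_instance(grab)
        # Runs after both hooks, so neither hook may set up the transport.
        grab.setup_transport(self.grab_transport_name)
        return grab


# Placeholder entry; 203.0.113.10 is a documentation-range address.
RESOLVE_ENTRY = 'api.somesite.com:443:203.0.113.10'


class EarlySetupSpider(FakeSpider):
    """Reproduces the issue: transport set up inside a hook."""

    def update_grab_instance(self, grab):
        grab.setup_transport('pycurl')
        grab.transport['options']['RESOLVE'] = [RESOLVE_ENTRY]


class LateOptionSpider(FakeSpider):
    """Possible workaround: set options after the parent call returns."""

    def setup_grab_for_task(self, task):
        grab = super().setup_grab_for_task(task)
        grab.transport['options']['RESOLVE'] = [RESOLVE_ENTRY]
        return grab


try:
    EarlySetupSpider().setup_grab_for_task(task=None)
except GrabMisuseError as exc:
    print('early setup fails:', exc)

late = LateOptionSpider().setup_grab_for_task(task=None)
print('late option set:', late.transport['options'])
```

In the model, both create_grab_instance and update_grab_instance run before the spider's own setup_transport call, which is why moving the code from one hook to the other produces the same error.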