scrapinghub/sample-projects

requests.exceptions.HTTPError: 401 Client Error: Unauthorized for url: https://storage.scrapinghub.com/collections/233792/s/nikoncoolpix

Following the Scrapy Price Monitor tutorial, I encountered an error after successfully deploying the project to Scrapy Cloud. Running, for example, the amazon.com spider job, it completes with 0 items and 5 errors (one for each 'product_name'). In the job log I get (for 'product_name': 'nikoncoolpix'):
[scrapy.core.scraper] Error processing {'retailer': 'amazon.com', 'product_name': 'nikoncoolpix', 'when': '2017/09/07 03:57:21', 'price': 256.95, 'title': 'Nikon COOLPIX B500 Digital Camera (Red)', 'url': 'https://www.amazon.com/Nikon-COOLPIX-B500-Digital-Camera/dp/B01C3LEE9G'}

Traceback (most recent call last):
  File "/usr/local/lib/python3.5/site-packages/twisted/internet/defer.py", line 588, in _runCallbacks
    current.result = callback(current.result, *args, **kw)
  File "/app/__main__.egg/price_monitor/pipelines.py", line 20, in process_item
  File "/usr/local/lib/python3.5/site-packages/scrapinghub/hubstorage/collectionsrt.py", line 152, in set
    return self._collections.set(self.coltype, self.colname, *args, **kwargs)
  File "/usr/local/lib/python3.5/site-packages/scrapinghub/hubstorage/collectionsrt.py", line 56, in set
    return self.apipost((_type, _name), is_idempotent=True, jl=_values)
  File "/usr/local/lib/python3.5/site-packages/scrapinghub/hubstorage/resourcetype.py", line 74, in apipost
    return self.apirequest(_path, method='POST', **kwargs)
  File "/usr/local/lib/python3.5/site-packages/scrapinghub/hubstorage/resourcetype.py", line 71, in apirequest
    return jldecode(self._iter_lines(_path, **kwargs))
  File "/usr/local/lib/python3.5/site-packages/scrapinghub/hubstorage/resourcetype.py", line 60, in _iter_lines
    r = self.client.request(**kwargs)
  File "/usr/local/lib/python3.5/site-packages/scrapinghub/hubstorage/client.py", line 107, in request
    return self.retrier.call(invoke_request)
  File "/usr/local/lib/python3.5/site-packages/retrying.py", line 206, in call
    return attempt.get(self._wrap_exception)
  File "/usr/local/lib/python3.5/site-packages/retrying.py", line 247, in get
    six.reraise(self.value[0], self.value[1], self.value[2])
  File "/usr/local/lib/python3.5/site-packages/six.py", line 686, in reraise
    raise value
  File "/usr/local/lib/python3.5/site-packages/retrying.py", line 200, in call
    attempt = Attempt(fn(*args, **kwargs), attempt_number, False)
  File "/usr/local/lib/python3.5/site-packages/scrapinghub/hubstorage/client.py", line 100, in invoke_request
    r.raise_for_status()
  File "/usr/local/lib/python3.5/site-packages/requests/models.py", line 844, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 401 Client Error: Unauthorized for url: https://storage.scrapinghub.com/collections/233792/s/nikoncoolpix
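For clarity, the failure mode boils down to the following pattern (a stdlib-only sketch, not the real hubstorage client): every process_item call tries to write to the collection, the request is rejected as unauthorized, each write raises, and Scrapy logs one error per item — which is why the job finishes with 0 items and 5 errors:

```python
# Stdlib-only sketch of the observed behavior; fake_collection_set is a
# stand-in for the real hubstorage collection.set() call, which fails
# identically for every item when the request carries bad credentials.

class HTTPError(Exception):
    pass

def fake_collection_set(item):
    # Stand-in for collection.set(); with invalid credentials every
    # call fails the same way.
    raise HTTPError("401 Client Error: Unauthorized")

def run(items):
    stored, errors = 0, []
    for item in items:
        try:
            fake_collection_set(item)
            stored += 1
        except HTTPError as exc:
            # Mirrors "[scrapy.core.scraper] Error processing ..." —
            # one log entry per item.
            errors.append((item["product_name"], str(exc)))
    return stored, errors

stored, errors = run(
    [{"product_name": name} for name in
     ["nikoncoolpix", "product2", "product3", "product4", "product5"]]
)
# stored == 0, len(errors) == 5
```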

I encounter the same error when running the spider in my local environment. I would really appreciate any help.

System specifications:

  • OS: Windows 10
  • Python 3.6.1
  • Scrapy 1.4.0
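For what it's worth, since the 401 suggests hubstorage is rejecting the request's credentials, I added a small pre-flight check to rule out a missing API key when running locally. This is only a sketch — the SHUB_APIKEY variable name is my assumption (it is what the shub tool reads), and your pipeline may take the key from a project setting instead:

```python
import os

def require_api_key(env_var="SHUB_APIKEY"):
    """Fail fast if no Scrapy Cloud API key is available.

    env_var is an assumption -- check which environment variable or
    setting your project's pipeline actually passes to the hubstorage
    client.
    """
    key = os.environ.get(env_var)
    if not key:
        raise RuntimeError(
            "No API key found in %s; hubstorage will answer "
            "401 Unauthorized" % env_var)
    return key
```

Running the check before starting the spider at least distinguishes "no key at all" from "wrong key or wrong project id".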