scrapinghub/python-scrapinghub

Add retry logic to Job Tag Update function

ftadao opened this issue · 0 comments

Description

An Internal Server Error is raised whenever a large number of tag updates run in parallel or sequentially.

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/project-1.0-py3.10.egg/XX/utils/workflow/__init__.py", line 930, in run
    start_stage, active_stage, ran_stages = self.setup_continuation(
  File "/usr/local/lib/python3.10/site-packages/project-1.0-py3.10.egg/XX/utils/workflow/__init__.py", line 667, in setup_continuation
    self._discard_jobs(start_stage, ran_stages)
  File "/usr/local/lib/python3.10/site-packages/project-1.0-py3.10.egg/XX/utils/workflow/__init__.py", line 705, in _discard_jobs
    self.get_job(jobinfo["key"]).update_tags(
  File "/usr/local/lib/python3.10/site-packages/scrapinghub/client/[jobs.py](http://jobs.py/)", line 503, in update_tags
    self._client._connection._post('jobs_update', 'json', params)
  File "/usr/local/lib/python3.10/site-packages/scrapinghub/[legacy.py](http://legacy.py/)", line 120, in _post
    return self._request(url, params, headers, format, raw, files)
  File "/usr/local/lib/python3.10/site-packages/scrapinghub/client/[exceptions.py](http://exceptions.py/)", line 98, in wrapped
    raise ServerError(http_error=exc)
scrapinghub.client.exceptions.ServerError: Internal server error

This is not a problem when updating tags on a couple of jobs, but when mass-updating them the error eventually surfaces.

Adding configurable retry logic to the update_tags function around that ServerError exception would make large-scale workflows easier to implement and debug.
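
In the meantime, a caller-side workaround is possible by wrapping update_tags in a retry loop that catches ServerError and backs off exponentially. The sketch below is only an illustration of that idea, not the proposed library change; the project id, API key, tag names, and the helper name update_tags_with_retry are all placeholders.

```python
import random
import time

from scrapinghub import ScrapinghubClient
from scrapinghub.client.exceptions import ServerError


def update_tags_with_retry(job, add=None, remove=None,
                           max_attempts=5, base_delay=1.0):
    """Retry job.update_tags() on ServerError with exponential backoff.

    Hypothetical caller-side helper; not part of python-scrapinghub.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return job.update_tags(add=add, remove=remove)
        except ServerError:
            if attempt == max_attempts:
                raise
            # Back off exponentially, with a bit of jitter to spread
            # out retries when many updates run in parallel.
            time.sleep(base_delay * 2 ** (attempt - 1) + random.random())


# Example usage with placeholder credentials and project id:
client = ScrapinghubClient("APIKEY")
project = client.get_project(12345)
for jobinfo in project.jobs.iter():
    update_tags_with_retry(project.jobs.get(jobinfo["key"]),
                           remove=["consumed"])
```

Built-in support in update_tags itself would remove the need for every workflow to carry a wrapper like this.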