innolitics/rdm

Cache GitHub History


Our GitHub-based project management backends make quite a few GitHub API requests. As a result, rdm pull can be slow, and it is possible to hit the GitHub API rate limit.

  • Research various methods to cache the results
  • Implement the best solution

Ideally the solution would only pull down issues or pull requests which have changed since they were last pulled.
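One way the incremental idea could look is sketched below. This is only a sketch under assumptions, not how rdm works today: `update_cache`, the JSON cache layout, and the `fetch_changed_since` callable are all hypothetical names. With PyGithub, the callable could wrap something like `repo.get_issues(state="all", since=...)`, since the GitHub issues endpoint accepts a `since` filter on last-updated time and returns pull requests as well as issues.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def update_cache(cache_path, fetch_changed_since):
    """Merge items changed since the last pull into a JSON cache file.

    fetch_changed_since is a hypothetical callable: it receives the ISO
    timestamp of the previous pull (or None on the first run) and returns
    an iterable of dicts, each with at least a "number" key.
    """
    cache_path = Path(cache_path)
    if cache_path.exists():
        cache = json.loads(cache_path.read_text())
    else:
        cache = {"last_pulled": None, "items": {}}

    # Record the time before fetching so updates made during the fetch
    # are picked up again on the next pull instead of being missed.
    started = datetime.now(timezone.utc).isoformat()

    for item in fetch_changed_since(cache["last_pulled"]):
        # Changed items overwrite the cached copy; the rest stay as-is.
        cache["items"][str(item["number"])] = item

    cache["last_pulled"] = started
    cache_path.write_text(json.dumps(cache, indent=2))
    return cache["items"]
```

On the first run everything is fetched; after that only items updated since the stored timestamp cost API requests, and unchanged items are served from the file.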

Here is an example stack trace from hitting the rate limit:
Traceback (most recent call last):
  File "/home/willy/code/venv/pc_reg/lib/python3.6/site-packages/rdm/main.py", line 20, in main
    exit_code = cli(sys.argv[1:])
  File "/home/willy/code/venv/pc_reg/lib/python3.6/site-packages/rdm/main.py", line 42, in cli
    pull_from_project_manager(args.config)
  File "/home/willy/code/venv/pc_reg/lib/python3.6/site-packages/rdm/pull.py", line 17, in pull_from_project_manager
    development_history = pm_backend.pull()
  File "/home/willy/code/venv/pc_reg/lib/python3.6/site-packages/rdm/project_management/github.py", line 50, in pull
    return _format_development_history(self.config, issues, pull_requests)
  File "/home/willy/code/venv/pc_reg/lib/python3.6/site-packages/rdm/project_management/github.py", line 71, in _format_development_history
    changes = [build_change(config, pr) for pr in pull_requests if _is_change(pr)]
  File "/home/willy/code/venv/pc_reg/lib/python3.6/site-packages/rdm/project_management/github.py", line 71, in <listcomp>
    changes = [build_change(config, pr) for pr in pull_requests if _is_change(pr)]
  File "/home/willy/code/venv/pc_reg/lib/python3.6/site-packages/rdm/project_management/github.py", line 132, in build_change
    approvals = change_approvals(config, pull_request)
  File "/home/willy/code/venv/pc_reg/lib/python3.6/site-packages/rdm/project_management/github.py", line 194, in change_approvals
    github_reviews = [r for r in pull_request.get_reviews()]
  File "/home/willy/code/venv/pc_reg/lib/python3.6/site-packages/rdm/project_management/github.py", line 194, in <listcomp>
    github_reviews = [r for r in pull_request.get_reviews()]
  File "/home/willy/code/venv/pc_reg/lib/python3.6/site-packages/github/PaginatedList.py", line 59, in __iter__
    newElements = self._grow()
  File "/home/willy/code/venv/pc_reg/lib/python3.6/site-packages/github/PaginatedList.py", line 71, in _grow
    newElements = self._fetchNextPage()
  File "/home/willy/code/venv/pc_reg/lib/python3.6/site-packages/github/PaginatedList.py", line 204, in _fetchNextPage
    "GET", self.__nextUrl, parameters=self.__nextParams, headers=self.__headers
  File "/home/willy/code/venv/pc_reg/lib/python3.6/site-packages/github/Requester.py", line 319, in requestJsonAndCheck
    verb, url, parameters, headers, input, self.__customConnection(url)
  File "/home/willy/code/venv/pc_reg/lib/python3.6/site-packages/github/Requester.py", line 342, in __check
    raise self.__createException(status, responseHeaders, output)
github.GithubException.RateLimitExceededException: 403 {"message": "API rate limit exceeded for user ID 890550.", "documentation_url": "https://docs.github.com/rest/overview/resources-in-the-rest-api#rate-limiting"}

Ugh! We should also catch this error and print something more useful than a huge stack trace!
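A rough sketch of what catching it could look like follows. `run_pull` is a hypothetical helper, not existing rdm code; the exception class is passed in as an argument so the sketch runs without PyGithub installed. In rdm it would presumably be the `RateLimitExceededException` shown in the traceback.

```python
import sys


def run_pull(pull, rate_limit_error, stream=sys.stderr):
    """Run pull(); on a rate-limit error, print a short message and return 1.

    pull is any zero-argument callable, and rate_limit_error is the
    exception class to treat as "rate limit hit". Both are assumptions
    made for this sketch.
    """
    try:
        pull()
    except rate_limit_error as error:
        # Replace the long traceback with a short, actionable message.
        print(
            "GitHub API rate limit exceeded; wait for the limit to reset "
            "and run the pull again.",
            file=stream,
        )
        print(f"Details: {error}", file=stream)
        return 1
    return 0
```

In `cli`, the existing call could then become something like `exit_code = run_pull(lambda: pull_from_project_manager(args.config), RateLimitExceededException)`, so a rate-limit hit turns into a non-zero exit code and a two-line message. Any other exception still propagates with its full traceback.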