InnerSourceCommons/InnerSourceLearningPath

Website build is failing

rrrutledge opened this issue · 7 comments

The error causing the build failure is ye olde Rate Limit error from our calls to GraphQL, which was supposed to have been fixed with the introduction of the throttling plugin.

I was unable to reproduce the same error locally, so I re-ran the action in GitHub with debugging enabled to see whether any new information would be logged when it errored out. However, the action completed successfully!

Looking back through the action history, the only thing I can think of that might make this a recurring issue is multiple build actions being kicked off in a very short amount of time. For example, on Aug 1st there were 4 failures in a row, but there were also multiple triggering commits from @tsadler1988 (2nd commit 2 minutes after the 1st) and @nysenthil (3 commits, each 1 minute apart).

Unfortunately, the logs for the older failures from 4 months back have expired, so I can't tell whether Rate Limit errors were the issue with those.

With this as my working theory, I manually re-ran the last 4 failed builds all together to see if I could reproduce it that way, and it worked! 2 of the builds that were making GraphQL calls simultaneously hit the Rate Limit error and failed. If you want to try to reproduce this yourself, re-run the Publish To Website actions for:

  • Update Links in Translations #97
  • Merge pull request #570 from InnerSourceCommons/tc-pt-br-01 #98

The remaining builds failed due to code conflicts and never made it to the GraphQL phase.

Now I am reading up on GitHub Actions to see if there is any way to have multiple triggers queue up and run the workflow sequentially rather than simultaneously.

This is great investigation, @marshmallowrobot!

Reopening since we just saw the build fail again with @tsadler1988's "Update Links in Translations" run.

Did some more digging on rate limits and have learned the following:

When using GITHUB_TOKEN, the rate limit is 1,000 requests per hour per repository. 

Source: https://docs.github.com/en/rest/overview/resources-in-the-rest-api?apiVersion=2022-11-28#rate-limits-for-requests-from-github-actions

We now have ~186 *.asciidoc files that each require 3 calls to the GitHub API, so a single build makes about 558 API calls. With the current setup, we can only run the build once per hour. Any more than that will result in Rate Limit errors.
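
To make the arithmetic explicit, here is a small Python sketch of the budget. The numbers come straight from this thread and the linked docs; the variable names are just for illustration, and the file count is approximate.

```python
# Rough API budget for one website build, using the figures quoted in this thread.
RATE_LIMIT_PER_HOUR = 1000  # GITHUB_TOKEN limit per repository (from the linked docs)
ASCIIDOC_FILES = 186        # approximate number of *.asciidoc files
CALLS_PER_FILE = 3          # GitHub API calls needed per file

calls_per_build = ASCIIDOC_FILES * CALLS_PER_FILE
builds_per_hour = RATE_LIMIT_PER_HOUR // calls_per_build

# Two builds inside the same hour overshoot the budget by this many calls.
shortfall_for_two_builds = 2 * calls_per_build - RATE_LIMIT_PER_HOUR

print(calls_per_build)           # 558
print(builds_per_hour)           # 1
print(shortfall_for_two_builds)  # 116
```

This also matches the earlier reproduction: two builds making calls at the same time need roughly 1,116 requests between them, which is over the 1,000 limit.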

I feel a bit gaslit here? I'm SURE I ran the build script more than twice in an hour while I was testing throttling API calls... The throttling fix went live on Nov 21st, 2022. It appears this rate limit change went live on Nov 28th, 2022. We had only one week of build bliss!

I guess we have options:

  • DO NOTHING. Just stop merging so often.
  • MANUAL BUILDS ONLY. Remove the push trigger. Merge all day, build only when you need it.
  • SCHEDULED BUILDS ONLY. Check once an hour for work to do. Downside: scheduled workflows can bloat the action run history even when no actual steps have run.
  • ALTERNATIVE TO API CALLS. Find another way to generate contributor info for articles.
  • Does our Node library support throttling/retry?
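
On the throttling/retry option, the general idea can be sketched as below. This is Python for illustration only, not our Node library's API: `RateLimitError`, `call_with_retry`, and the `retry_after` field are all hypothetical names, and the sketch assumes the client can tell us how long to wait.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimitError(Exception):
    """Hypothetical error a client might raise when the API reports a rate limit."""

    def __init__(self, retry_after: float):
        super().__init__(f"rate limited, retry after {retry_after}s")
        self.retry_after = retry_after


def call_with_retry(
    request: Callable[[], T],
    max_retries: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run request(); on RateLimitError, wait the advised time and try again."""
    for attempt in range(max_retries + 1):
        try:
            return request()
        except RateLimitError as err:
            if attempt == max_retries:
                raise  # out of retries, let the build fail loudly
            sleep(err.retry_after)
    raise AssertionError("unreachable")
```

Worth noting: with an hourly limit, a retry may have to wait until the limit resets, so this would probably only help with short pauses rather than with two full builds landing in the same hour.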

This may become an issue again if activity increases, but for now it will not prevent us from building to the website as and when required.