hyperledger-cacti/cacti

ci(github): commit parity check to ignore dependabot pull requests

Closed this issue · 3 comments

Description

Dependabot pull requests are failing the PR commit parity check, and the only way to fix them is to manually edit every pull request the bot makes. That is a drain on our resources, and I'm not sure it's worth it for this particular category of pull requests. Here's why:

The dependabot pull requests are always the same in the sense that they just bump dependencies, so their complexity stays very low.

With this in mind, the easiest shortcut for us right now is to add a configuration option to the commit parity check that lets us tell it to ignore pull requests from certain authors (for now, the only item we'll put on that list is dependabot).

Acceptance Criteria

  1. It is possible to extend the list of bot authors later without having to change the source code of the check itself (env variables are a great way to introduce this configurability, using a comma-separated list such as CACTI_CUSTOM_CHECK_PR_COMMIT_PARITY_EXEMPTED_AUTHORS=dependabot,some-other-bot-1,yet_another_bot_2, etc.).
  2. Pull requests from the exempted authors pass the commit parity check.
  3. The check is still very much in effect for everyone else (humans) who are making more complex contributions which require that we make sure the commit message contains the full context that the PR description had in it.
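
To make the first two criteria concrete, here is a minimal sketch of how the check could read the exemption list. Only the env variable name comes from this issue; the helper names (parseExemptedAuthors, isExemptedAuthor) are hypothetical, and stripping a trailing "[bot]" suffix from the login is an assumption about how bot accounts are reported to the check.

```javascript
// Sketch only: the helper names below are hypothetical, not existing Cacti code.
const ENV_VAR = "CACTI_CUSTOM_CHECK_PR_COMMIT_PARITY_EXEMPTED_AUTHORS";

/** Lower-cases a login and drops a trailing "[bot]" suffix (assumption). */
function normalizeLogin(login) {
  return String(login).trim().toLowerCase().replace(/\[bot\]$/, "");
}

/** Splits a comma-separated value into normalized, non-empty author names. */
function parseExemptedAuthors(rawValue) {
  return (rawValue || "")
    .split(",")
    .map(normalizeLogin)
    .filter((name) => name.length > 0);
}

/** Returns true when the PR author is on the exemption list. */
function isExemptedAuthor(author, exemptedAuthors) {
  return exemptedAuthors.includes(normalizeLogin(author));
}

// An unset variable yields an empty list, so nobody is exempted by default.
const exemptedFromEnv = parseExemptedAuthors(process.env[ENV_VAR]);
```

With this shape, adding another bot later only means changing the variable's value; everyone who is not on the list still goes through the full check.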

Hi @petermetz
If we have a workflow-level conditional statement that checks whether the PR creator is a bot, that would be best, because we can then see the skipped status of the workflow.
However, GitHub workflows have a limitation: we cannot use env vars in conditional statements:
image

Setting it up as part of the github context is resource intensive (we would need to run a job to set it up).
And if we set it up as part of the vars context, it would have to be maintained as repository variables under the Actions and Secrets section of the repo settings.

Can we do something like this (less elegant, but the least resource-intensive way, imo)?

image

(My main point here is that a GitHub workflow-level conditional statement would be the most informative and least resource-intensive way of doing this.)

@jagpreetsinghsasan Agreed on all counts. I think the hardcoded bot names are a good trade-off for now; we only have dependabot on that list of bots anyway (that I know of).

I also very much agree that the resource usage of the check is already too high. Ideally a baseline check like this wouldn't need to run yarn install or yarn configure at all and would have a much smaller footprint. I know that we need the install because of the string similarity library I suggested, but maybe we can have the robots write a quick and naive implementation of string similarity that is good enough, something like this:

https://chatgpt.com/share/40999856-36eb-4f41-a841-2e0d6af37eca

If we could make the commit parity check just a git clone ... && node ./tools/check-commit-parity.js, it would save a lot of resources and also give people rapid feedback while the task of opening the PR is still fresh in their minds (if we need 5-10 minutes to tell them that the check has failed, they have probably already closed the browser tab and moved on to some other task).
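
For illustration, a dependency-free similarity function could look roughly like the sketch below. It uses a bigram-based Sørensen-Dice coefficient, which is picked here only as an example of a "quick and naive" approach; it is not necessarily the algorithm from the linked chat or the one the current library uses, and the function names are hypothetical.

```javascript
// Sketch only: a naive string similarity with no npm dependencies.

/** Counts the overlapping two-character substrings (bigrams) of a string. */
function bigramCounts(text) {
  const counts = new Map();
  for (let i = 0; i < text.length - 1; i++) {
    const bigram = text.slice(i, i + 2);
    counts.set(bigram, (counts.get(bigram) || 0) + 1);
  }
  return counts;
}

/** Returns a score between 0 (nothing shared) and 1 (identical). */
function similarity(a, b) {
  // Ignore whitespace and letter case so formatting differences do not matter.
  const left = a.replace(/\s+/g, "").toLowerCase();
  const right = b.replace(/\s+/g, "").toLowerCase();
  if (left === right) return 1;
  if (left.length < 2 || right.length < 2) return 0;

  const leftCounts = bigramCounts(left);
  const rightCounts = bigramCounts(right);
  let shared = 0;
  for (const [bigram, count] of leftCounts) {
    shared += Math.min(count, rightCounts.get(bigram) || 0);
  }
  // Dice coefficient: twice the shared bigrams over the total bigram count.
  return (2 * shared) / (left.length - 1 + (right.length - 1));
}
```

Because this only needs Node.js itself, a script built around it could run straight after the clone, without yarn install or yarn configure.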

@petermetz I think the library we used also uses the same algorithm under the hood. True, the check should not need those dependencies; otherwise devs will just leave it as it is and move on to other tasks. I will implement the same (I haven't tried using LLMs for anything other than generating images and posters; this is the first time I am seeing one write code XD)