crissyfield/repo-lookout

Emails are sent too late

Closed this issue · 1 comment

tja commented

In the current workflow, we start with a quick heuristic to see whether a Git repository might be exposed, follow it with a more reliable but slower Git repository scan that reads the last five commit history entries, and finally send an email report.

Because we send emails more slowly than we discover Git repositories, the backlog between the second and third steps has been growing. As of today (May 25th), we are still sending out emails based on information collected over two months ago (March 23rd). Some of this information may be outdated by now, and the Git repository may no longer be exposed.

To avoid sending out emails and bothering people unnecessarily, we need to change the workflow logic so that the backlog grows between the first and second steps (which is fine) and the backlog between the second and third steps is capped.

tja commented

This has been implemented by adding a max-email-tasks setting to the inspect command (which reads the Git commit history) and the lookup command (which collects contact information). Setting it to N causes the command to exit early if the database already has more than N open email tasks.
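
For illustration, the early-exit check described above could look roughly like the sketch below. This is not the project's actual code: the `email_tasks` table, its `sent` column, the use of SQLite, and the function names are all assumptions made for this example. Only the rule itself (exit when more than N email tasks are open) comes from the comment above.

```python
import sqlite3


def open_email_tasks(conn):
    """Count email tasks that have not been sent yet (hypothetical schema)."""
    row = conn.execute(
        "SELECT COUNT(*) FROM email_tasks WHERE sent = 0"
    ).fetchone()
    return row[0]


def should_exit_early(conn, max_email_tasks):
    """Return True if more than max_email_tasks email tasks are still open."""
    return open_email_tasks(conn) > max_email_tasks


def demo_database():
    """Build an in-memory database with three open and one sent email task."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE email_tasks (id INTEGER PRIMARY KEY, sent INTEGER NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO email_tasks (sent) VALUES (?)",
        [(0,), (0,), (0,), (1,)],
    )
    return conn


if __name__ == "__main__":
    conn = demo_database()
    # With a limit of 2 and 3 open tasks, a command would stop before
    # producing more work for the email step.
    if should_exit_early(conn, max_email_tasks=2):
        print("too many open email tasks, exiting early")
```

Because the comparison is strict, a limit of N still allows the command to run when exactly N email tasks are open, which matches the "more than N" wording above.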