/githump

Loot email addresses from commit entries

Primary LanguageShellDo What The F*ck You Want To Public LicenseWTFPL

githump

Simple bash script that loots email addresses from commit entries. Email addresses are set via user config when pushing changes up to github. Running git log against a repository shows a list of commits, from which email addresses can be parsed. The githump script enumerates all repositories for a target organization or user and then extracts email addresses from the commit logs of each repository. Finally, all unique emails are extracted from the intermediary results and saved off in the results directory.

Usage

Usage is easy: ./githump.sh <target> where <target> is the github account username. For example, ./githump.sh SalesforceEng to target everything at https://github.com/SalesforceEng.

TODO

Future improvements to be enumerated here.

Historical Aggregation

This is more of a documentation and calling issue. Write up instructions for running as a repeated task and collecting email addresses historically. This could be done by pushing result commits up to bitbucket or another version control repository.

New Results Only

The JSON results from the github API include an updated_at key-value pair. If running githump daily and aggregating results, a check should be added to only clone and search repositories that have been updated since the previous run.