sourcerer-io/sourcerer-app

Ignore HUGE commits

AlexisTM opened this issue · 0 comments

Hi there. From time to time a repo gets copied (such as GoogleCode->GitHub) with no commit history kept, or a third-party library has to be copied into the repo to follow company requirements on third parties.

I propose filtering these out by ignoring any commit of more than, for example, 100k lines. That threshold should be high enough to still count huge squashed pull requests while filtering out external libraries. In my case, I wish to filter out AlexisTM/RT-WiFi, which accounts for 2M lines that were given to me but not written by me.
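To make the proposal concrete, here is a minimal Python sketch of such a filter. It is not how sourcerer-app works internally; `CommitStat`, `filter_huge_commits` and `MAX_COMMIT_LINES` are hypothetical names, and counting a commit's size as added plus deleted lines is an assumption.

```python
from dataclasses import dataclass

# Hypothetical cutoff, taken from the "100k lines" example above.
MAX_COMMIT_LINES = 100_000


@dataclass
class CommitStat:
    """Hypothetical per-commit record; not part of sourcerer-app."""
    sha: str
    lines_added: int
    lines_deleted: int

    @property
    def lines_changed(self) -> int:
        # Assumption: commit size = added + deleted lines.
        return self.lines_added + self.lines_deleted


def filter_huge_commits(commits, max_lines=MAX_COMMIT_LINES):
    """Return only the commits that change at most max_lines lines."""
    return [c for c in commits if c.lines_changed <= max_lines]
```

A commit of exactly 100k lines is kept here, since the proposal only excludes commits of more than the threshold. The cutoff could also be exposed as a user setting rather than a constant.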

BTW, a new fun fact could be the "average commit size" or "median commit size", or the trend of commit sizes (number of commits in each category: 0-10, 11-100, 101-1k, 1k1-10k, 10k+ lines, or any other logarithmic base).
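As a rough illustration of that fun fact, the sketch below computes the average, the median and a per-bucket count from a list of commit sizes. The function names are hypothetical, and it assumes the last bucket is meant to be "more than 10k lines".

```python
from collections import Counter
from statistics import mean, median

# Inclusive upper bounds for the logarithmic buckets listed above.
BUCKETS = [(10, "0-10"), (100, "11-100"), (1_000, "101-1k"), (10_000, "1k1-10k")]
LAST_BUCKET = "10k+"  # assumption: everything above 10k lines


def bucket_label(lines: int) -> str:
    """Return the label of the bucket that a commit size falls into."""
    for upper, label in BUCKETS:
        if lines <= upper:
            return label
    return LAST_BUCKET


def commit_size_stats(sizes):
    """Summarize commit sizes (lines changed per commit)."""
    if not sizes:
        raise ValueError("sizes must not be empty")
    histogram = Counter(bucket_label(n) for n in sizes)
    return {
        "average": mean(sizes),
        "median": median(sizes),
        "histogram": dict(histogram),
    }
```

With a single huge imported commit in the list, the average is pulled far above the typical commit size while the median barely moves, which is one reason to show both.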