epam/OSCI

Savannah OSCI

vlad-isayko opened this issue · 0 comments

The goal is to create and automate an analysis of repos hosted on Savannah (https://savannah.gnu.org/). This would be similar to our existing OSCI ranking, which analyses repos hosted on GitHub with a focus on the activity of commercial organizations.

  1. A solution that crawls data about push event commits (PEC). Each record should contain the following required fields:
    • event creation date;
    • commit author (email address, name);
    • SHA.
  2. Adapt the existing pipeline to process Savannah data.
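
The required PEC fields listed above could be modelled roughly as follows. This is a minimal sketch, not the existing OSCI schema: the names `PushEventCommit`, `parse_git_log`, `SEP` and `GIT_LOG_FORMAT` are hypothetical. It also assumes that the data comes from running `git log` on a local clone of a Git-based Savannah project, and that the Git author date can stand in for the event creation date, since a separate push timestamp may not be available.

```python
from dataclasses import dataclass
from datetime import datetime

# Field separator: the ASCII unit separator is unlikely to occur in names or emails.
SEP = "\x1f"

# Assumed usage: git log --pretty=format:<GIT_LOG_FORMAT> inside a local clone.
# %H = commit SHA, %an = author name, %ae = author email, %aI = author date (strict ISO 8601).
GIT_LOG_FORMAT = SEP.join(["%H", "%an", "%ae", "%aI"])


@dataclass(frozen=True)
class PushEventCommit:
    """One PEC record holding the required fields (hypothetical layout)."""
    sha: str
    author_name: str
    author_email: str
    event_created_at: datetime  # assumption: the author date is used as the event date


def parse_git_log(output: str) -> list[PushEventCommit]:
    """Parse `git log` output produced with GIT_LOG_FORMAT into PEC records."""
    commits = []
    for line in output.splitlines():
        if not line:
            continue
        sha, name, email, date = line.split(SEP)
        # Some Git versions print "Z" for UTC; older Pythons need an explicit offset.
        created = datetime.fromisoformat(date.replace("Z", "+00:00"))
        commits.append(PushEventCommit(sha, name, email, created))
    return commits
```

Keeping the parser separate from the command that produces the log makes it possible to test the record format without network access or a cloned repository.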

We did a high-level technical analysis of the feasibility of building an OSCI for repos hosted on Savannah. This is a summary of our findings:

| Criteria | Status (Yes/No) | Notes (e.g. how it is possible, limitations) |
|---|---|---|
| Is this site free to use for open source projects? | Yes | |
| Does it look like this site hosts many open source projects? | Yes | In total (all projects): "23990 registered users, 3829 hosted projects" |
| Is there a public API we can query? | No | However, we can parse HTML pages |
| API type | - | |
| API URL | - | |
| Query limits (if any) | - | |
| Is there a paid access with more information? | - | |
| Is it possible to query the project license? | Yes | By parsing the HTML page |
| Is it possible to query commit events/commit counts by a user in a time period? | No | |
| Is it possible to query email address or else some organization information for the person making a commit? | Yes | By parsing the HTML page (email address) |
| Is there a public archive we can use instead of the public API? | No | |
| Any additional information worth knowing? | Yes | Commit information can be obtained by parsing web pages if the project is Git-based |
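
Since the author's email address is the only organization-related signal found above, one way to approximate activity per organization is to count commits by email domain. The sketch below illustrates that idea under stated assumptions: the function names are hypothetical, and mapping a domain to an actual company is left out because it is not covered by this analysis.

```python
from collections import Counter
from typing import Iterable


def email_domain(email: str) -> str:
    """Return the lower-cased part after the last '@' of an email address."""
    return email.rpartition("@")[2].strip().lower()


def commits_per_domain(author_emails: Iterable[str]) -> Counter:
    """Count commits per email domain, skipping values without an '@'."""
    return Counter(email_domain(e) for e in author_emails if "@" in e)
```

For example, passing the author emails of all crawled commits to `commits_per_domain` yields a counter keyed by domain, which a later step could translate into organization names.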