Savannah OSCI
vlad-isayko opened this issue · 0 comments
The goal is to create and automate analysis of repos hosted on Savannah (https://savannah.gnu.org/). This would be similar to our existing OSCI ranking which analyses repos hosted on GitHub, with a focus on the activity by commercial organizations.
- Build a solution that crawls data about push event commits (PEC). Each record should contain the following required fields:
  - event creation date;
  - commit author (email address, name);
  - SHA.
- Adapt the existing pipeline to process Savannah data.
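
To make the required fields concrete, here is a minimal sketch of a PEC record. The class name `PushEventCommit`, the helper `parse_log_line`, and the tab-separated input layout are all hypothetical and not part of the existing OSCI pipeline. The layout matches what `git log --pretty=format:'%H%x09%an%x09%ae%x09%aI'` prints on a local clone, which is a different technique from crawling HTML. Using the author date as the event creation date is also an assumption, since Git itself does not record push events.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class PushEventCommit:
    """Hypothetical PEC record holding the required fields."""
    event_created_at: datetime  # assumption: author date stands in for the event date
    author_name: str
    author_email: str
    sha: str


def parse_log_line(line: str) -> PushEventCommit:
    """Parse one tab-separated line: SHA, author name, author email, ISO 8601 date."""
    sha, name, email, date = line.rstrip("\n").split("\t")
    return PushEventCommit(
        event_created_at=datetime.fromisoformat(date),
        author_name=name,
        author_email=email.lower(),  # normalise case so domains compare consistently
        sha=sha,
    )
```

Lower-casing the email address is a design choice made here so that later matching of email domains to organizations does not depend on case.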
We did a high-level technical analysis of the feasibility of building an OSCI for repos hosted on Savannah. This is a summary of our findings:
Criteria | Status (Yes/No) | Notes (e.g. about how it is possible, or limitations, etc) |
---|---|---|
Is this site free to use for open source projects? | yes | |
Does it look like this site hosts many open source projects? | yes | In total (all projects): "23990 registered users, 3829 hosted projects" |
Is there a public API we can query? | no | however, we can parse HTML pages |
API type | - | |
API URL | - | |
Query Limits (if any) | - | |
Is there a paid access with more information? | - | |
Is it possible to query the project license? | Yes | by parsing HTML page |
Is it possible to query commit events/commit counts by a user in a time period? | no | |
Is it possible to query email address or else some organization information for the person making a commit? | Yes | by parsing the HTML page (email address) |
Is there a public archive we can use instead of the public API? | no | |
Any additional information worth knowing? | yes | commit information can be obtained by parsing web pages if the project is Git-based |
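
Since the table relies on HTML parsing for author information, the following is a rough sketch of pulling email addresses out of page text using only the Python standard library. The names `EmailCollector` and `extract_emails` are hypothetical, and so is the sample markup in the usage note. The real structure of Savannah's commit pages would need to be checked before relying on anything like this.

```python
import re
from html.parser import HTMLParser
from typing import List

# Deliberately loose pattern; it is not a full RFC 5322 address validator.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


class EmailCollector(HTMLParser):
    """Collects email-like strings found in the text nodes of an HTML page."""

    def __init__(self) -> None:
        super().__init__()  # convert_charrefs=True by default, so &lt; becomes <
        self.emails: List[str] = []

    def handle_data(self, data: str) -> None:
        self.emails.extend(EMAIL_RE.findall(data))


def extract_emails(html: str) -> List[str]:
    """Return email-like strings in document order (duplicates are kept)."""
    parser = EmailCollector()
    parser.feed(html)
    parser.close()
    return parser.emails
```

For example, `extract_emails("<td>Jane Doe &lt;jane@example.com&gt;</td>")` yields a list containing `jane@example.com`. The function only reads text nodes, so addresses stored in attributes such as `href="mailto:..."` would need an additional `handle_starttag` hook.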