This repository tracks the progress of the Fall 2021 Network Analysis Project. The goal of the project was to build a recommendation system for GitHub repositories based on the 2009 GitHub bipartite network, which represents unweighted User-Repository-Watches relationships. A pipeline similar to the one used in this project can be applied to any bipartite graph to build a recommendation system for either of the two node classes.
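To make the idea of recommending from a bipartite graph concrete, here is a minimal sketch. It is not necessarily the method used in this project: it assumes a simple co-watch heuristic (count the two-hop paths user → repo → other user → candidate repo), and the function name `recommend` and the toy edge list are hypothetical.

```python
from collections import Counter, defaultdict


def recommend(watches, user, k=5):
    """Rank repositories for `user` by a simple co-watch count.

    `watches` is an iterable of (user, repo) edges of the bipartite graph.
    This is an illustrative heuristic, not the project's actual pipeline.
    """
    repos_of = defaultdict(set)  # user -> repos the user watches
    users_of = defaultdict(set)  # repo -> users watching the repo
    for u, r in watches:
        repos_of[u].add(r)
        users_of[r].add(u)

    already_watched = repos_of[user]
    scores = Counter()
    for repo in already_watched:
        for other in users_of[repo]:
            if other == user:
                continue
            # Every repo watched by a co-watcher is a candidate.
            for candidate in repos_of[other]:
                if candidate not in already_watched:
                    scores[candidate] += 1

    # Sort by descending score, then by name to keep ties deterministic.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [repo for repo, _ in ranked[:k]]


# Toy data (made up for illustration only).
edges = [
    ("alice", "numpy"), ("alice", "pandas"),
    ("bob", "numpy"), ("bob", "scipy"),
    ("carol", "pandas"), ("carol", "scipy"), ("carol", "flask"),
]
print(recommend(edges, "alice"))
```

Swapping the roles of the two node classes (passing `(repo, user)` pairs instead) would, under the same assumption, recommend users for a repository.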
While the focus of this project was on understanding, applying, and arguing for or against theoretical concepts of network analysis using real-life data, the final product, a recommendation graph (statically saved with a maximum of 5 recommendations), was translated into a simple web-app. Below, you can find all the important resources:
Ludek Cizinsky
Jonas-Mika Senghaas
Lukas Rasocha
The project was (excluding the web-app) written entirely in Python and makes use of several external libraries. If you want to run the scripts and notebooks yourself, it is recommended to create a virtual environment (using the Python environment manager of your choice) and then install all dependencies from `requirements.txt`. Follow these steps to create a standard Python `venv`:
- Create a virtual environment: `python3 -m venv [name of venv]`
- Activate the venv using: `source [name of venv]/bin/activate`. You can deactivate it using the command `deactivate`.
- Update `pip`: `pip install --upgrade pip`
- Install all dependencies: `pip install -r <path/to/requirements.txt>`
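After these steps, a quick sanity check can be run from Python. The sketch below is not part of the project: the helper names `in_virtualenv` and `missing_requirements` are hypothetical, and the requirement parsing is deliberately simplified (it assumes plain lines such as `name==1.2.3` and skips comments and option lines).

```python
import re
import sys
from importlib import metadata


def in_virtualenv():
    """Return True if the running interpreter belongs to a virtual environment."""
    # Inside a venv, sys.prefix points at the venv while sys.base_prefix
    # points at the interpreter the venv was created from.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def missing_requirements(requirement_lines):
    """Return the package names from the given lines that are not installed."""
    missing = []
    for line in requirement_lines:
        line = line.split("#", 1)[0].strip()  # drop comments
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        # Keep only the package name in front of any version specifier.
        name = re.split(r"[<>=!~;\[ ]", line, maxsplit=1)[0]
        if not name:
            continue
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


print("virtual environment active:", in_virtualenv())
```

To check a real file, the lines can be passed in directly, for example `missing_requirements(open("requirements.txt"))`, assuming the file sits in the current directory.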
We use the public GitHub API to dynamically query metadata that is displayed in our web-app and that feeds additional information into our recommendation system. The API can be used for free without authentication, but authenticating with an API token raises the rate limit to 5000 requests per hour. If you want to run the scripts yourself, you must create your own free GitHub API token. Follow these steps:
- Visit your personal GitHub Token Settings.
- Create a new personal token and set the scope to `public_repo`.
- Copy the token and save it in a Python file at the following location: `csripts/envvars.py`. Format your file as follows: `GITHUB=TOKEN=[your_token]`
- The file is automatically ignored by `.gitignore`.
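As an illustration of how such a token is typically used, the sketch below builds an authenticated request against the public GitHub REST API with only the standard library. It is an assumption about usage, not the project's actual code: the function names `build_repo_request` and `fetch_repo_metadata` are hypothetical, and the token is passed in as a plain argument rather than loaded from the file described above.

```python
import json
import urllib.request

API_ROOT = "https://api.github.com"


def build_repo_request(full_name, token=None):
    """Build a request for the metadata of a repository given as 'owner/name'."""
    request = urllib.request.Request(f"{API_ROOT}/repos/{full_name}")
    request.add_header("Accept", "application/vnd.github+json")
    if token:
        # Authenticated requests get the higher rate limit.
        request.add_header("Authorization", f"token {token}")
    return request


def fetch_repo_metadata(full_name, token=None):
    """Send the request and return the decoded JSON response (needs network access)."""
    with urllib.request.urlopen(build_repo_request(full_name, token), timeout=10) as response:
        return json.load(response)
```

Calling `fetch_repo_metadata("owner/name", token)` returns a dictionary with fields such as `description` and `stargazers_count`. The remaining quota can be inspected by requesting `https://api.github.com/rate_limit` with the same headers.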