Scripts to calculate rich-club coefficients from input graphs built from GitHub data.

Rich Club Behavior

This repository includes scripts to calculate rich-club coefficients and to generate supergraphs from source interaction graphs (GitHub activities: commits, pull requests, issues).

More details are given in the paper "Analyzing Rich-Club Behavior in Open Source Projects". The complete dataset needed to reproduce the results is available here: https://doi.org/10.7910/DVN/AA4IIS
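For readers new to the metric, the sketch below shows the standard unnormalized rich-club coefficient, phi(k) = 2 * E_k / (N_k * (N_k - 1)), where N_k is the number of nodes with degree greater than k and E_k is the number of edges among them. It is a minimal pure-Python illustration, not the code used in this repository: the function name rich_club_coefficients and the toy edge list are assumptions made for the example, and any normalization against random graphs is left out.

```python
from collections import defaultdict


def rich_club_coefficients(edges):
    """Return {k: phi(k)} for an undirected simple graph given as (u, v) pairs.

    Hypothetical helper for illustration only; not part of this repository.
    """
    # Build adjacency sets, ignoring self-loops and duplicate edges.
    adjacency = defaultdict(set)
    for u, v in edges:
        if u != v:
            adjacency[u].add(v)
            adjacency[v].add(u)

    degree = {node: len(neighbors) for node, neighbors in adjacency.items()}
    unique_edges = {frozenset((u, v)) for u in adjacency for v in adjacency[u]}

    coefficients = {}
    for k in range(max(degree.values(), default=0)):
        # The "rich club" at level k: nodes with degree strictly greater than k.
        rich = {node for node, d in degree.items() if d > k}
        if len(rich) < 2:
            break  # phi(k) is undefined with fewer than two nodes
        # Count edges whose two endpoints are both in the rich club.
        links = sum(1 for edge in unique_edges if edge <= rich)
        coefficients[k] = 2 * links / (len(rich) * (len(rich) - 1))
    return coefficients


# Toy graph (made up): a 4-clique a-b-c-d plus one pendant node e attached to d.
toy_edges = [
    ("a", "b"), ("a", "c"), ("a", "d"),
    ("b", "c"), ("b", "d"), ("c", "d"),
    ("d", "e"),
]
toy_phi = rich_club_coefficients(toy_edges)
print(toy_phi)
```

In the toy graph the four clique members form a fully connected club once the pendant node is excluded, so the coefficient reaches 1.0 for k = 1 and k = 2.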

To use the code, first install all the dependencies:

  • The main dependencies are listed in the requirements.txt file (for example, pip install -r requirements.txt)
  • The SNAP library for graph manipulation can be installed by following the instructions on its website

After installing the dependencies, download the complete dataset and follow these instructions:

  1. Run the script to generate supergraphs: python generate_supergraphs.py
  2. Run the script to calculate the rich-club coefficient for a specific set of graphs. Parameters:
    • --graph: select the source graphs for coefficient calculation. Available values: G (supergraph); i (issues); p (pull-requests); c (commits)
    • --N: number of graphs for which to run the calculation (based on the projects analyzed in the paper; refer to project.csv for the complete list)
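As a rough picture of step 1, the sketch below merges three interaction graphs into one. It assumes that a supergraph is simply the union of the undirected edges of the commit, pull-request, and issue graphs of a project; that reading, the function name build_supergraph, and the sample developers are assumptions for illustration, and generate_supergraphs.py may work differently (for example, by using SNAP graph objects).

```python
def build_supergraph(*edge_lists):
    """Union several undirected edge lists into one set of edges.

    Hypothetical helper; assumes a supergraph is the edge union of its sources.
    """
    supergraph = set()
    for edges in edge_lists:
        for u, v in edges:
            if u != v:  # skip self-loops
                # Sort the endpoints so (u, v) and (v, u) count as the same edge.
                supergraph.add(tuple(sorted((u, v))))
    return supergraph


# Made-up interaction graphs for one project.
commits = [("alice", "bob")]
pull_requests = [("bob", "alice"), ("bob", "carol")]
issues = [("carol", "dave")]

supergraph = build_supergraph(commits, pull_requests, issues)
print(sorted(supergraph))
```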

Example: python richclubcoefficient.py --graph G --N 10

This computes the coefficient for the supergraphs of the first 10 projects.
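The two parameters described above could be declared with argparse roughly as follows. This is a sketch of the command-line interface as documented here, not the actual parser in richclubcoefficient.py: the help strings, the choice to make both options required, and the build_parser name are assumptions.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented --graph and --N options (hypothetical)."""
    parser = argparse.ArgumentParser(
        description="Calculate rich-club coefficients for a set of graphs."
    )
    parser.add_argument(
        "--graph",
        choices=["G", "i", "p", "c"],
        required=True,  # assumption: the real script may define a default instead
        help="source graphs: G (supergraph), i (issues), p (pull-requests), c (commits)",
    )
    parser.add_argument(
        "--N",
        type=int,
        required=True,  # assumption: the real script may define a default instead
        help="number of graphs for which to run the calculation",
    )
    return parser


# Equivalent to: python richclubcoefficient.py --graph G --N 10
args = build_parser().parse_args(["--graph", "G", "--N", "10"])
print(args.graph, args.N)
```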

Note that the computation of coefficients is not always successful (as stated in the paper), and it can take a long time for large graphs.

The first 10 supergraphs are provided as examples in this repository.