
Analysis and visualization of top GitHub repositories

Primary language: Python

GitHub Repository Collaboration Network Analysis

An analysis and visualization of collaboration between top GitHub repositories, focused on the relationship between programming languages used and the network structure.

Interactive visualization:

More information and the full analysis: PDF report

Credits

General workflow

  • From Google BigQuery:
    • repo-attributes.sql creates repo-attributes.csv
    • repo-weights.sql creates repo-weights.csv
  • The process.py script reads both .csv files and creates repositories.gml
  • Gephi loads repositories.gml and creates:
  • Python code within analysis-*.texw reads repositories.gml and produces output for the report
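The conversion step in the workflow above (two .csv files in, one .gml file out) can be sketched roughly as follows. This is not the actual process.py: the function name `build_gml` and the column names `repo`, `language`, `repo_a`, `repo_b`, and `weight` are assumptions made for illustration, and the real query output may use different columns. The sketch uses only the standard library and writes the GML text by hand.

```python
import csv
import io


def build_gml(attributes_csv, weights_csv):
    """Combine node attributes and edge weights into a GML document (as a string).

    Both arguments are open text files (or file-like objects) with a header row.
    All column names used here are hypothetical, not taken from the real CSVs.
    """
    nodes = list(csv.DictReader(attributes_csv))
    # GML nodes need integer ids, so map each repository name to its row index.
    ids = {row["repo"]: i for i, row in enumerate(nodes)}

    lines = ["graph [", "  directed 0"]
    for row in nodes:
        name = row["repo"]
        language = row["language"]
        # Assumes values contain no double quotes, which GML strings cannot hold.
        lines += [
            "  node [",
            f"    id {ids[name]}",
            f'    label "{name}"',
            f'    language "{language}"',
            "  ]",
        ]

    for row in csv.DictReader(weights_csv):
        a, b = row["repo_a"], row["repo_b"]
        # Skip edges whose endpoints have no attribute row.
        if a not in ids or b not in ids:
            continue
        weight = float(row["weight"])
        lines += [
            "  edge [",
            f"    source {ids[a]}",
            f"    target {ids[b]}",
            f"    weight {weight}",
            "  ]",
        ]

    lines.append("]")
    return "\n".join(lines) + "\n"


# Small in-memory demonstration with made-up rows.
demo_attributes = io.StringIO("repo,language\na/x,Python\nb/y,Go\n")
demo_weights = io.StringIO("repo_a,repo_b,weight\na/x,b/y,3\n")
print(build_gml(demo_attributes, demo_weights))
```

With real files, the same function could be called on `open("repo-attributes.csv", newline="")` and `open("repo-weights.csv", newline="")`, with the returned string written to repositories.gml. A graph library would normally handle quoting and validation more robustly than this hand-written writer.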