Hollywood Network Centrality
Executive Summary
When analyzing a graph network, a common goal is to find the most central figure in the group. However, centrality can be defined and calculated in many ways. Using the Hollywood network of actors and recent films as an example, we discuss seven centrality measures in detail, explaining when to choose each one and how to interpret its score. We also discuss how incorporating edge weights can change these calculations. For Hollywood, we conclude that Charlize Theron is the most central figure in recent years according to several of the centrality measures.
Access the full report at https://htmlpreview.github.io/?https://github.com/NA-Dev/graph-network-centrality/blob/main/FinalProject.html
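As a rough illustration of how the choice of centrality measure and the use of edge weights can change the ranking, here is a minimal Python sketch using only the standard library. It is not the project's code (the project itself is written in R). The actor names, the shared-film counts, and the helper functions `degree_centrality`, `strength`, and `closeness_centrality` are all hypothetical and chosen only for this example. It covers just two of the measures (degree and closeness) plus a weighted variant of degree.

```python
from collections import deque

# Hypothetical toy network: the actor names and shared-film counts are made up
# for illustration and are not taken from the project's dataset.
edges = [
    ("Actor A", "Actor B", 3),
    ("Actor A", "Actor C", 1),
    ("Actor A", "Actor D", 1),
    ("Actor B", "Actor C", 2),
    ("Actor D", "Actor E", 1),
]

# Build an undirected adjacency map: actor -> {co-star: number of shared films}.
graph = {}
for u, v, w in edges:
    graph.setdefault(u, {})[v] = w
    graph.setdefault(v, {})[u] = w


def degree_centrality(g):
    """Number of distinct co-stars per actor (edge weights ignored)."""
    return {node: len(nbrs) for node, nbrs in g.items()}


def strength(g):
    """Weighted degree: total number of shared films per actor."""
    return {node: sum(nbrs.values()) for node, nbrs in g.items()}


def closeness_centrality(g):
    """(n - 1) divided by the sum of hop distances to every other actor.

    Distances come from a breadth-first search; assumes a connected graph.
    """
    n = len(g)
    result = {}
    for source in g:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nbr in g[node]:
                if nbr not in dist:
                    dist[nbr] = dist[node] + 1
                    queue.append(nbr)
        result[source] = (n - 1) / sum(dist.values())
    return result


degree = degree_centrality(graph)
weighted = strength(graph)
closeness = closeness_centrality(graph)

for actor in sorted(graph):
    print(actor, degree[actor], weighted[actor], round(closeness[actor], 3))
```

In this toy network, Actor A has the most co-stars and the highest closeness, but once shared-film counts are used as weights, Actor B ties with Actor A. This is the kind of shift that edge weights can introduce.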
Data Sources
Film and actor data from IMDB were loaded into a MySQL database. Only a subset of this data was used, retrieved with a query specified in the final report writeup.
https://www.imdb.com/interfaces/
Supplemental data on film financials was added by fetching it programmatically from the following API.
File List
SaveDataBase.R - script to save IMDB data files to a database
Queries.R - queries to retrieve desired data subset
hollywood_dataset.csv - final dataset (queried dataset supplemented with API data)
FinalProject.Rmd - code and writeup