This project involves:
- An extensive study of the centrality of the main protagonists, i.e. degree, betweenness, closeness and PageRank.
- Calculating the global clustering coefficient of the graph and the local clustering coefficient of the main protagonist nodes.
- Detecting communities.
- Finding the degree distribution, the average shortest path length and the size of the largest component.
- Creating an equivalent generative model to compare against the social graph extracted from the text.
- Make a list of the characters in the novel. You need to decide whom to include. For example, for the Mahabharata, there is no point in including a character representing a random soldier ;-)
- Extract a social graph of the manually identified characters in the text (as shown in the hands-on session). To do this, use a co-occurrence algorithm as discussed and demonstrated in class.
- Calculate the four types of centrality of the main protagonists, i.e. degree, betweenness, closeness and PageRank. (Ref: centrality analysis)
- Calculate the global clustering coefficient of your graph and the local clustering coefficient of the main protagonist nodes. Detect communities. (Ref: Measures of cohesion)
- Find the degree distribution, the average shortest path length and the size of the largest component. Also create an equivalent generative model to compare against the social graph that you extracted. (Ref: Generative models)
- Does what you know of the story match what you got from your network analysis?
- Do you have any insights to offer?
- Who are the protagonists according to your analysis? If the four centrality measures are not highly correlated, how do you interpret this?
- What do the clustering coefficients, the discovered communities, the extracted ego networks of the protagonists and the average shortest path length tell you about the dynamics of the story?
- You have compared your graph against a generative model (random graph, Watts-Strogatz, preferential attachment, etc.). What does comparing the parameters of that model with those of your extracted graph tell you?
- Feel free to add any appropriate visualization, using Gephi only, to substantiate your analysis.
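
As a rough illustration of the extraction and analysis tasks listed above, here is a minimal Python sketch that uses only the standard library. It assumes that two characters are linked whenever both names appear in the same sentence, which may differ from the co-occurrence algorithm shown in class. The toy text, the character list and all function names are invented for illustration. Only degree centrality is covered; betweenness, closeness, PageRank and community detection are left out for brevity, and a G(n, m) random graph stands in for the generative model.

```python
import itertools
import random
import re


def cooccurrence_graph(text, characters):
    """Assumed rule: link two characters named in the same sentence."""
    adj = {c: set() for c in characters}
    for sentence in re.split(r"[.!?]+", text):
        present = [c for c in characters if c in sentence]
        for a, b in itertools.combinations(present, 2):
            adj[a].add(b)
            adj[b].add(a)
    return adj


def degree_centrality(adj):
    """Degree of each node divided by the maximum possible degree (n - 1)."""
    n = len(adj)
    return {v: (len(nbrs) / (n - 1) if n > 1 else 0.0) for v, nbrs in adj.items()}


def linked_neighbour_pairs(adj, v):
    """Number of pairs of neighbours of v that are themselves connected."""
    return sum(1 for a, b in itertools.combinations(adj[v], 2) if b in adj[a])


def local_clustering(adj, v):
    """Fraction of neighbour pairs of v that are connected to each other."""
    k = len(adj[v])
    if k < 2:
        return 0.0
    return 2 * linked_neighbour_pairs(adj, v) / (k * (k - 1))


def global_clustering(adj):
    """Transitivity: closed triples divided by all connected triples."""
    closed = sum(linked_neighbour_pairs(adj, v) for v in adj)
    triples = sum(len(adj[v]) * (len(adj[v]) - 1) // 2 for v in adj)
    return closed / triples if triples else 0.0


def random_graph_like(adj, seed=0):
    """G(n, m) random graph with the same node and edge counts as adj."""
    rng = random.Random(seed)
    nodes = sorted(adj)
    m = sum(len(nbrs) for nbrs in adj.values()) // 2
    edges = rng.sample(list(itertools.combinations(nodes, 2)), m)
    rand = {v: set() for v in nodes}
    for a, b in edges:
        rand[a].add(b)
        rand[b].add(a)
    return rand


# Toy input, invented purely for illustration.
characters = ["Arjuna", "Krishna", "Bhima", "Duryodhana", "Karna"]
text = (
    "Arjuna and Krishna rode to the field. "
    "Krishna spoke to Arjuna and Bhima. "
    "Bhima fought Duryodhana. Duryodhana met Karna."
)

graph = cooccurrence_graph(text, characters)
baseline = random_graph_like(graph)

print("degree centrality:", degree_centrality(graph))
print("local clustering (Bhima):", local_clustering(graph, "Bhima"))
print("global clustering, extracted graph:", global_clustering(graph))
print("global clustering, random baseline:", global_clustering(baseline))
```

On a real novel, simple substring matching would miss aliases and nicknames, so the character list would probably need one entry per alias mapped to a single node. A clustering coefficient much higher than that of the random baseline is the kind of difference the comparison with a generative model is meant to surface.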