Film_Script_Analysis

The aim of this project is to provide detailed insights into the movies analyzed, focusing on the characters, their dialogues, scene locations, emotional and sentiment analysis of the whole movie and of individual characters, the characters' interactions with one another, and finally the gender distribution in each movie analyzed.

Sample Output

Highlights

  • 1000+ movie scripts were scraped from the IMSDB database and segmented into SCENE LOCATIONS/NAMES, SCENE ACTIONS, SCENE CHARACTERS and SCENE DIALOGUES.

  • Visualizations for 20 movie scripts (based on in-depth analysis) were produced. These scripts were randomly chosen from the 1000+ movie scripts I segmented, to test the validity of the movie script analysis algorithm.

  • Highly recommended for visualization: the graphs/plots for the 20 movie scripts can be viewed using this link:

This is because Plotly graphs are not rendered on GitHub.

To carry out this project, the following objectives were executed sequentially:

  1. Web scraping of the movie scripts (over 1000 movies were scraped from the IMSDB website).

  2. Movie segmentation into scenes --> Scene Location, Scene Action/Description, Scene Dialogues, Scene Characters (all the scraped movies were segmented except those that do not follow the "Screenplay format", i.e. INT / EXT scene headings).

  3. Character extraction and appearance plots --> Characters were plotted based on how many times they appeared and spoke in each scene and across the movie.

  4. Character interaction mapping --> We mapped out the connections between all the characters in the movie, and also the interactions among the Top 10 characters.

  5. Character mentions --> We looked at the most-mentioned character based on the scene dialogues, and also at the characters each character mentions the most in conversation.

  6. Similar to number 5, we looked at whom a specific character talks with the most in the movie.

  7. Emotional and sentiment analysis across the whole movie and for each individual character; for this project we limited the character-level analysis to the Top 10 characters. --> This gives us a character's emotion when he/she appears in the movie.

  8. Additional scene information --> Exact scene locations, scenes with and without dialogue, scenes that occur during the day or at night, and scene locations classified as outdoor or indoor.

  9. Gender distribution in the movie.
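As an illustration of objectives 2 and 8, here is a minimal sketch of heading-based scene segmentation. It is not the project's actual code: the `SCENE_HEADING` pattern, the `segment_scenes` function, and the assumed heading layout (`INT.`/`EXT.`, a location, then an optional ` - TIME` suffix) are all hypothetical, chosen only to show how INT / EXT headings can drive the split.

```python
import re

# Assumed heading layout (hypothetical): "INT." or "EXT." (or a combined
# form), a location, and an optional " - TIME OF DAY" suffix in upper case.
SCENE_HEADING = re.compile(
    r"^\s*(?P<setting>INT\./EXT\.|INT/EXT\.?|INT\.?|EXT\.?)\s+"
    r"(?P<location>.+?)"
    r"(?:\s+-\s+(?P<time>[A-Z ]+))?\s*$"
)


def segment_scenes(script_text):
    """Split raw script text into scenes, one per INT/EXT heading."""
    scenes = []
    current = None
    for line in script_text.splitlines():
        match = SCENE_HEADING.match(line)
        if match:
            time_of_day = match.group("time")
            current = {
                "setting": match.group("setting"),      # indoor vs. outdoor
                "location": match.group("location").strip(),
                "time": time_of_day.strip() if time_of_day else None,
                "body": [],                             # action and dialogue lines
            }
            scenes.append(current)
        elif current is not None and line.strip():
            # Any non-empty line after a heading belongs to the current scene.
            current["body"].append(line.strip())
    return scenes
```

Text that appears before the first heading is ignored in this sketch, and scripts without INT / EXT headings yield an empty list, which mirrors the exclusion described in objective 2. The `setting` and `time` fields are the kind of values that indoor/outdoor and day/night tallies could be built from.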

Python modules for this project:

imsbd_moviescript_scraper_AND_Scene_Segmentation.py --- scrapes HTML text from the IMSDB database and segments each movie into scenes,

dialogue_appearance.py --- dialogue appearances,

characters_extract.py --- extracts characters and visualizes the number of times they appeared,

xter_interaction.py --- visualizes character interaction mapping,

characters_mt.py --- character mentions,

emotions.py --- emotional arcs and sentiment analysis of the movie and the script,

movie_info.py --- movie information, e.g. scene location and time-of-day occurrences for each scene,

gend_distribution_plot.py --- gender distribution.
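To make the character interaction mapping more concrete, the sketch below counts how often each pair of characters shares a scene. This is an assumption about one reasonable way to define "interaction" (scene co-occurrence), not necessarily what xter_interaction.py does; the function name `interaction_counts` and its input format are hypothetical.

```python
from collections import Counter
from itertools import combinations


def interaction_counts(scene_characters):
    """Count how often each pair of characters appears in the same scene.

    scene_characters: a list of scenes, each given as a list of character
    names (hypothetical input format).
    """
    pairs = Counter()
    for characters in scene_characters:
        # De-duplicate and sort so each pair has one canonical ordering.
        for a, b in combinations(sorted(set(characters)), 2):
            pairs[(a, b)] += 1
    return pairs
```

Each `(a, b)` pair and its count could then serve as a weighted edge in a graph library such as networkx for plotting.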

Tools: Python libraries: pandas, numpy, regular expressions (regex), plotly, nltk, seaborn, networkx. Regex is the major engine for text cleaning and movie scene segmentation.
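Since regex is named as the main text-processing engine, here is a hedged sketch of how character mentions in dialogue could be counted with it. The function `count_mentions`, its arguments, and the choice of case-insensitive whole-word matching are assumptions for illustration, not the behavior of characters_mt.py.

```python
import re
from collections import Counter


def count_mentions(dialogues, character_names):
    """Count whole-word, case-insensitive mentions of each character name.

    dialogues: list of dialogue strings; character_names: list of names
    (both hypothetical input formats).
    """
    # \b word boundaries prevent "JOHN" from matching inside "Johnny".
    patterns = {
        name: re.compile(rf"\b{re.escape(name)}\b", re.IGNORECASE)
        for name in character_names
    }
    mentions = Counter()
    for line in dialogues:
        for name, pattern in patterns.items():
            mentions[name] += len(pattern.findall(line))
    return mentions
```

Whole-word matching is a deliberate simplification: it misses nicknames and pronouns, so real scripts would likely need an alias table on top of it.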