End-to-end data science project in Python: 1) data scraping, 2) sentiment analysis, and 3) data visualization.
I was curious to see if sentiment scores could be used to visualize character arcs. If so, could sentiment analysis help writers evaluate character development in their work?
For this project, I analyzed one of my favorite TV shows, Avatar: The Last Airbender. I used Jupyter Notebook for documentation. Follow along to see how to:
- Scrape the web for episode transcripts with Beautiful Soup
- Manipulate data with pandas and analyze character dialogue using VADER
- Create interactive visualizations of the sentiment scores with Plotly Express
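
The first two steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the HTML layout in `SAMPLE_HTML`, the helper names (`parse_transcript`, `make_vader_scorer`, `add_sentiment`), and the column names are all assumptions made for the example, and it assumes the standalone `vaderSentiment` package rather than any other VADER distribution.

```python
import pandas as pd
from bs4 import BeautifulSoup

# Hypothetical markup standing in for one scraped transcript page.
# The real transcript pages will almost certainly be structured differently.
SAMPLE_HTML = """
<table>
  <tr><th>Aang</th><td>I love flying with Appa!</td></tr>
  <tr><th>Zuko</th><td>I hate this. I must capture the Avatar.</td></tr>
  <tr><td colspan="2">[Scene change]</td></tr>
</table>
"""


def parse_transcript(html, episode):
    """Return one row per spoken line with columns episode, character, line."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for tr in soup.find_all("tr"):
        speaker, speech = tr.find("th"), tr.find("td")
        if speaker is None or speech is None:
            continue  # skip stage directions and malformed rows
        rows.append(
            {
                "episode": episode,
                "character": speaker.get_text(strip=True),
                "line": speech.get_text(strip=True),
            }
        )
    return pd.DataFrame(rows, columns=["episode", "character", "line"])


def make_vader_scorer():
    """Build a function mapping text to VADER's compound score in [-1, 1]."""
    # Imported here so the parsing helpers work without vaderSentiment installed.
    from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

    analyzer = SentimentIntensityAnalyzer()
    return lambda text: analyzer.polarity_scores(text)["compound"]


def add_sentiment(df, scorer):
    """Return a copy of df with a 'compound' column holding scorer(line)."""
    scored = df.copy()
    scored["compound"] = scored["line"].map(scorer)
    return scored
```

Calling `add_sentiment(parse_transcript(SAMPLE_HTML, episode=1), make_vader_scorer())` would yield one scored row per line of dialogue. Passing the scorer in as an argument keeps the pandas step independent of the sentiment library, so a different scorer could be swapped in.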
See the accompanying Medium blog post for a detailed project tutorial.
Without prior knowledge of the series, I could've guessed that Azula is the villain by looking at the “Sentiments per episode” plot. Her trend line rises before declining sharply towards the finale (typical for stories where the good guys win). Although sentiment progression is not a perfect proxy for character development, it might one day be part of a larger algorithm that helps writers evaluate their work. See the Medium blog post for further discussion.
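
A per-episode view like the one described here could be produced along these lines. This is only a sketch under assumptions: the column names (`character`, `episode`, `compound`), the function names, and the choice of a simple per-episode mean drawn with `px.line` are mine, not necessarily what the notebook does. The numbers in `TOY_SCORES` are made-up placeholders, not results from the analysis.

```python
import pandas as pd

# Made-up placeholder scores purely to exercise the code; NOT real results.
TOY_SCORES = pd.DataFrame(
    {
        "character": ["Azula", "Azula", "Azula", "Aang"],
        "episode": [1, 1, 2, 1],
        "compound": [0.2, 0.4, -0.6, 0.5],
    }
)


def mean_sentiment_per_episode(scored):
    """Average the compound score of each character's lines within an episode."""
    return (
        scored.groupby(["character", "episode"], as_index=False)["compound"]
        .mean()
        .sort_values(["character", "episode"])
        .reset_index(drop=True)
    )


def plot_sentiment_per_episode(per_episode):
    """Draw one line per character; returns a Plotly figure."""
    # Imported here so the aggregation above works without Plotly installed.
    import plotly.express as px

    return px.line(
        per_episode,
        x="episode",
        y="compound",
        color="character",
        title="Sentiments per episode",
    )
```

In a notebook, `plot_sentiment_per_episode(mean_sentiment_per_episode(scored)).show()` would render the interactive chart, where `scored` is a DataFrame of line-level scores with the three columns named above.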
You must have Jupyter Notebook installed on your computer. Download atla_sentiment_analysis.ipynb to the current directory, then open Jupyter Notebook by running `jupyter notebook` in the command line.