An NLP implementation for extracting and analyzing the relationships between characters in novels.
Our project was inspired by the classic Chinese novel The Story of the Stone, which is famous for its large cast of characters and the subtle relationships between its main characters. Applying the NLP techniques we learned in class to the analysis and visualization of character relationships in novels can help readers understand these intricate relationships quickly and clearly. The approach is not restricted to novels: it can also help identify relationships between people on other platforms, such as news and social media. In the information age, the amount of data produced every day is growing rapidly. This project explores ways to extract information effectively and promptly from large amounts of data.
Harry Potter Series
- Text data collection
- Text data preprocessing
- Word analysis of the books
- Named entity extraction
- Named entity analysis
- Create a dictionary for the top 30 characters
- Subject and object analysis of sentences
- Use the word2vec NLP model to analyze the relationships between characters
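
To make the steps above more concrete, here is a minimal sketch of one simple way to score character relationships: counting how often two characters are mentioned in the same sentence. It is only an illustration and not the project's actual code. It uses plain sentence co-occurrence as a stand-in for the word2vec step, a naive regex sentence splitter instead of a real preprocessing pipeline, and a hypothetical three-entry alias dictionary in place of the top 30 characters dictionary. The sample text is made up for the example and is not taken from the books.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical alias dictionary: canonical character name -> surface forms.
# These three entries are illustrative only; the project uses its own
# dictionary for the top 30 characters.
ALIASES = {
    "Harry": ["Harry", "Potter"],
    "Ron": ["Ron", "Weasley"],
    "Hermione": ["Hermione", "Granger"],
}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def characters_in(sentence, aliases=ALIASES):
    """Return the canonical names whose aliases appear as whole words."""
    words = set(re.findall(r"[A-Za-z']+", sentence))
    return {name for name, forms in aliases.items() if words & set(forms)}


def cooccurrence_counts(text, aliases=ALIASES):
    """Count, for each pair of characters, the sentences mentioning both."""
    counts = Counter()
    for sentence in split_sentences(text):
        present = sorted(characters_in(sentence, aliases))
        for pair in combinations(present, 2):
            counts[pair] += 1
    return counts


# Made-up sample text, used only to exercise the functions above.
sample = (
    "Harry and Ron walked to class. Hermione was already there. "
    "Ron asked Hermione for help, and Harry laughed."
)
counts = cooccurrence_counts(sample)
for pair, n in counts.most_common():
    print(pair, n)
```

Pairs with higher counts appear together more often and are candidates for a closer relationship. The counts can also serve as edge weights when drawing a character graph. A trained word2vec model would replace the raw counts with similarities between the characters' name vectors.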