My girls live pretty far away from me, so WhatsApp is an important part of how we communicate. I decided to analyse our chats over the last 2 years and see what I could find!
The full Jupyter notebook with comments can be found here. The raw text file for the chat can be found here.
- Python
- Pandas
- NumPy
- NLTK
- Seaborn
- Matplotlib
- Wordcloud
- Emoji
I was able to extract 2 years of data from my WhatsApp chat, put it into a Pandas DataFrame and answer the following questions:
- Who sent the most messages?
- What is the busiest day for chatting?
- Who sends the most images?
- Who records the most voice notes?
- What are the most popular words used in our family chat?
- What is the most popular chat time?
- Which emojis are used most in the chat?
- What does our message frequency look like over time?
- What was the overall tone of our conversation?
- Who types the most in the family?
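To give a feel for how questions like these can be answered, here is a minimal sketch of loading an exported chat into a Pandas DataFrame and counting messages per sender, per weekday and per hour. It is not the code from the notebook. The line layout (`YYYY/MM/DD, HH:MM - Sender: message`) is an assumption, since WhatsApp export formats differ by phone and locale, and the function name `parse_chat`, the sample messages and the names Anna and Beth are made up for illustration.

```python
import re

import pandas as pd

# Assumed export layout: "YYYY/MM/DD, HH:MM - Sender: message".
# Real exports vary by device and locale, so this pattern may need adjusting.
LINE_RE = re.compile(r"^(\d{4}/\d{2}/\d{2}), (\d{2}:\d{2}) - ([^:]+): (.*)$")


def parse_chat(lines):
    """Turn raw chat lines into a DataFrame with timestamp, sender, message."""
    rows = []
    for line in lines:
        line = line.rstrip("\n")
        match = LINE_RE.match(line)
        if match:
            date, time, sender, text = match.groups()
            rows.append(
                {"timestamp": f"{date} {time}", "sender": sender, "message": text}
            )
        elif rows:
            # A line without a timestamp is treated as the continuation
            # of the previous (multi-line) message.
            rows[-1]["message"] += "\n" + line
    df = pd.DataFrame(rows, columns=["timestamp", "sender", "message"])
    df["timestamp"] = pd.to_datetime(df["timestamp"], format="%Y/%m/%d %H:%M")
    return df


# Hypothetical sample data standing in for the raw text file.
sample = [
    "2021/03/13, 09:15 - Mark: Good morning girls",
    "2021/03/13, 09:17 - Anna: Morning Dad!",
    "and happy Saturday",
    "2021/03/14, 18:05 - Anna: Look at this view",
    "2021/03/14, 18:06 - Beth: Nice photo",
    "2021/03/14, 18:30 - Anna: Thanks",
]

chat = parse_chat(sample)

# Who sent the most messages?
messages_per_sender = chat["sender"].value_counts()

# What is the busiest day for chatting?
busiest_day = chat["timestamp"].dt.day_name().value_counts().idxmax()

# What is the most popular chat time (hour of the day)?
busiest_hour = chat["timestamp"].dt.hour.value_counts().idxmax()

print(messages_per_sender)
print(busiest_day, busiest_hour)
```

With a real export, the `sample` list would be replaced by the lines read from the text file. Note that system lines without a `Sender:` part would be folded into the previous message by this sketch, so they may need filtering first.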
This project allowed me to practice EDA and visualisation while having fun with Python and making something my girls and I could laugh about.
If you have any feedback, please reach out to me at mark@markstent.co.za