WhatsApp is probably one of the most popular messaging apps around and, based on this article, it is approaching the 1 billion user mark. Nearly everyone I know uses WhatsApp daily, and I believe there are useful patterns or behaviours that we can extract from group chats.
The script first tries to parse the uploaded chat history file (.txt) using regular expressions and groups the parsed lines into two pandas dataframes: one for normal messages, the other for actions such as changing the chat subject, changing the group icon, adding users, removing users, etc.
It then produces an ugly but simple chart of some statistics from the messages dataframe. I have not worked on analytics for the WhatsApp actions yet; I will probably do so when I have more free time :)
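# Raw strings (r"...") keep backslashes such as \d and \s intact for the regex engine.
date_patterns = {
    # e.g. "5 Jan 10:30" or "5 Jan 2016 10:30"
    "long_datetime": r"(?P<datetime>\d{1,2}\s{1}\w{3}(\s{1}|\s{1}\d{4}\s{1})\d{2}:\d{2})",
    # e.g. "25/12/2015, 10:30"
    "short_datetime": r"(?P<datetime>\d{2}/\d{2}/\d{4},\s{1}\d{2}:\d{2})",
}
# " - <name>: <message>" following the timestamp
message_pattern = r"\s{1}-\s{1}(?P<name>(.*?)):\s{1}(?P<message>(.*?))$"
# " - <action>" following the timestamp (no "name: " part)
action_pattern = r"\s{1}-\s{1}(?P<action>(.*?))$"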
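As a rough illustration of how these patterns might fit together, the sketch below prepends each date pattern to the message and action patterns and tries them against a single line. The `parse_line` helper and the sample lines are assumptions made for this example, not the script's actual code. Note that an action whose text happens to contain `": "` would be classified as a message by this simple approach.

```python
import re

# Patterns as defined above, written as raw strings.
date_patterns = {
    "long_datetime": r"(?P<datetime>\d{1,2}\s{1}\w{3}(\s{1}|\s{1}\d{4}\s{1})\d{2}:\d{2})",
    "short_datetime": r"(?P<datetime>\d{2}/\d{2}/\d{4},\s{1}\d{2}:\d{2})",
}
message_pattern = r"\s{1}-\s{1}(?P<name>(.*?)):\s{1}(?P<message>(.*?))$"
action_pattern = r"\s{1}-\s{1}(?P<action>(.*?))$"


def parse_line(line):
    """Hypothetical helper: classify one chat line as a message or an action.

    Returns a (kind, fields) tuple, where kind is "message", "action" or None.
    """
    for date_pattern in date_patterns.values():
        # Try the more specific message pattern first, then fall back to action.
        match = re.match(date_pattern + message_pattern, line)
        if match:
            return "message", match.groupdict()
        match = re.match(date_pattern + action_pattern, line)
        if match:
            return "action", match.groupdict()
    # No timestamp recognised, e.g. the continuation of a multi-line message.
    return None, {}


# Made-up sample lines, for illustration only.
print(parse_line("25/12/2015, 10:30 - Alice: Merry Christmas!"))
print(parse_line("5 Jan 2016 10:31 - Alice added Bob"))
```

The two result kinds map naturally onto the two dataframes: the field dictionaries for messages and for actions can be collected into separate lists and each list passed to `pandas.DataFrame`.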
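To support another date format, simply add more key-value pairs to the date_patterns dictionary. Each new pattern should capture the timestamp in a named group called datetime, like the existing ones.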
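For example, a new entry could look like the sketch below. The `us_short_datetime` key and the 12-hour format it matches are hypothetical, chosen only to show the mechanics; check the timestamps in your own exported file before writing a pattern.

```python
import re

date_patterns = {
    "long_datetime": r"(?P<datetime>\d{1,2}\s{1}\w{3}(\s{1}|\s{1}\d{4}\s{1})\d{2}:\d{2})",
    "short_datetime": r"(?P<datetime>\d{2}/\d{2}/\d{4},\s{1}\d{2}:\d{2})",
}
message_pattern = r"\s{1}-\s{1}(?P<name>(.*?)):\s{1}(?P<message>(.*?))$"

# Hypothetical new format: "12/25/15, 9:30 PM" (two-digit year, 12-hour clock).
# The named group must still be called "datetime".
date_patterns["us_short_datetime"] = (
    r"(?P<datetime>\d{1,2}/\d{1,2}/\d{2},\s{1}\d{1,2}:\d{2}\s{1}[AP]M)"
)

# Made-up sample line, for illustration only.
sample = "12/25/15, 9:30 PM - Alice: hello"
match = re.match(date_patterns["us_short_datetime"] + message_pattern, sample)
print(match.groupdict())
```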
TODO
Special thanks to D|Science; I used his code for generating the charts.