WhatsApp export processor.
Use this to convert your raw WhatsApp conversation exports into structured JSON. This works for conversations exported to Google Drive. Use this for:
- Fleeing WhatsApp for Telegram / Signal
- Archiving WhatsApp conversations
- Misinformation research (what this was originally designed for)
Extra features:
- Obfuscation: For research use, an --obfuscation-key parameter is available to remove personally identifiable information (PII) from the conversation. Names, phone numbers, and group names are obfuscated.
- Smart merging of multiple exports: Export to the same folder twice without getting duplicate messages
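The two extra features above can be pictured with a minimal sketch. It is not the code this script uses: the function names, the message fields (`timestamp`, `sender`, `text`), and the choice of HMAC-SHA256 as the keyed obfuscation are all assumptions made for illustration.

```python
import hashlib
import hmac


def obfuscate(value: str, key: str) -> str:
    """Replace a name or phone number with a stable, keyed pseudonym.

    Assumption: a keyed hash (HMAC-SHA256) stands in for whatever
    --obfuscation-key actually does inside the script.
    """
    digest = hmac.new(key.encode("utf-8"), value.encode("utf-8"), hashlib.sha256)
    # Truncate so the pseudonym stays readable in the output JSON.
    return digest.hexdigest()[:12]


def merge_exports(*exports: list) -> list:
    """Merge several lists of messages, keeping the first copy of each."""
    seen = set()
    merged = []
    for export in exports:
        for message in export:
            # Hypothetical identity: same timestamp, sender, and text.
            identity = (message["timestamp"], message["sender"], message["text"])
            if identity not in seen:
                seen.add(identity)
                merged.append(message)
    return merged
```

With a keyed hash, the same name always maps to the same pseudonym under one key, so conversations stay analysable, while a different key yields unrelated pseudonyms.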
Usage
To export your chats
- Tap the three-dot menu at the top right of your WhatsApp conversation.
- Go to More -> Export chat
- Choose either with or without media
- Choose the "Drive: Save to Drive" option.
- IMPORTANT: Move your export to a Google Drive folder
Installation
Download this repo and then run:
pip3 install -r requirements.txt
Authentication
There are two options for authentication. Both generate a JSON file which needs to be passed to the script:
- Individual account option: With your Google account, enable the Drive API here: https://developers.google.com/drive/api/v3/quickstart/python This will give you a credentials.json file.
- Service account option: Create a service account here: https://console.developers.google.com/iam-admin/serviceaccounts After creating it, click "create key" on the tab to the right and download the key. IMPORTANT: You will then need to share the Drive directory with the service account email.
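If you are unsure which kind of JSON file you downloaded, a small check like the one below can tell them apart. It is a sketch, not part of this repo: the function name `describe_credentials` is hypothetical. It relies on the layout Google uses for these files, where a service account key carries `"type": "service_account"` and a `client_email`, and an OAuth client file nests its settings under `installed` or `web`.

```python
import json


def describe_credentials(path: str) -> tuple:
    """Return (kind, email) for a Google credentials JSON file.

    kind is "service_account", "oauth_client", or "unknown".
    email is only set for service accounts; it is the address
    the Drive folder has to be shared with.
    """
    with open(path, encoding="utf-8") as handle:
        data = json.load(handle)
    if data.get("type") == "service_account":
        return ("service_account", data.get("client_email"))
    if "installed" in data or "web" in data:
        return ("oauth_client", None)
    return ("unknown", None)
```

For a service account, the returned email is the one to add under "Share" on the Drive folder.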
Basic script usage
./whatsapp_processor.py path/to/creds.json drive.google.com/folders/drive_id mdy --verbose
For help on the CLI arguments, try ./whatsapp_processor.py --help
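The `mdy` argument in the command above appears to give the order of month, day, and year in the exported dates. The sketch below shows why that order matters when turning a raw line into structured JSON. It is an illustration only: the line layout (`<date>, <HH:MM> - <sender>: <text>`), the function name `parse_line`, and the output fields are assumptions, and real exports vary by phone locale.

```python
import re
from datetime import datetime
from typing import Optional

# Assumed line layout: "1/31/20, 21:15 - Alice: Hello"
LINE = re.compile(
    r"^(\d{1,2})/(\d{1,2})/(\d{2,4}), (\d{1,2}):(\d{2}) - ([^:]+): (.*)$"
)


def parse_line(line: str, date_order: str = "mdy") -> Optional[dict]:
    """Parse one export line into a dict, or None if it is not a message."""
    match = LINE.match(line)
    if match is None:
        # System notices and continuation lines have no "sender: text" part.
        return None
    first, second, third, hour, minute, sender, text = match.groups()
    # Map each date field to its meaning, e.g. "mdy" -> {"m":.., "d":.., "y":..}
    parts = dict(zip(date_order, (int(first), int(second), int(third))))
    # Two-digit years are assumed to be in the 2000s.
    year = parts["y"] + 2000 if parts["y"] < 100 else parts["y"]
    stamp = datetime(year, parts["m"], parts["d"], int(hour), int(minute))
    return {"timestamp": stamp.isoformat(), "sender": sender, "text": text}
```

Under these assumptions, reading `31/1/20` as `mdy` raises a `ValueError` (there is no month 31), which is a useful signal that the wrong date order was passed.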
Acknowledgements
This was adapted from code I wrote for Tattle. Tattle is a civic tech project that builds tools and datasets to better understand and respond to (mis)information trends in India. The original code can be found here.
License
Testing
./test.sh
The test.sh file uses the Python coverage module. You can view the code coverage report with:
firefox htmlcov/index.html