Facebook allows users to download a copy of their data as a zip archive of htm files. This parser takes that archive, extracts the user's Facebook Messages, converts them into a more useful format, and performs some analysis to produce interesting data.
This code is adapted from CopOnTheRun/FB-Message-Parser.
The Facebook Export can be downloaded from the Facebook Settings menu.
Before any code can be run, lines 26 and 27 in fb_parser.py will need to be updated to the name and username of the account being parsed. Once this is done, the code will attempt to open the zip file facebook-[myusername].zip by default if no argument is given to facebook.py.
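The default-filename behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the code in facebook.py: the names `USERNAME`, `MESSAGES_MEMBER`, `resolve_export_path` and `read_messages_html` are hypothetical, and the location of messages.htm inside the archive (`html/messages.htm`) is an assumption that may not match every export.

```python
import os
import zipfile

USERNAME = "myusername"  # hypothetical placeholder for the account's username
MESSAGES_MEMBER = "html/messages.htm"  # assumed path of messages.htm inside the zip


def resolve_export_path(argv, username=USERNAME):
    """Return the file named on the command line, else the default archive name."""
    if len(argv) > 1:
        return argv[1]
    return "facebook-{0}.zip".format(username)


def read_messages_html(path, member=MESSAGES_MEMBER):
    """Return the raw bytes of messages.htm from a zip archive or a bare htm file."""
    if zipfile.is_zipfile(path):
        # The argument is an export archive: pull the messages file out of it.
        with zipfile.ZipFile(path) as archive:
            return archive.read(member)
    # Otherwise treat the argument as messages.htm itself.
    with open(path, "rb") as handle:
        return handle.read()
```

A script could call `resolve_export_path(sys.argv)` and pass the result to `read_messages_html`, which accepts either kind of input file.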
Run "python facebook.py [optional_filename]" with the facebook-[myusername].zip or messages.htm file in the same directory to export the messages to CSV, display the 10 most messaged friends, and output a graph of the messages exchanged with the most messaged friend. This sample code can easily be adapted.
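As a rough idea of what the CSV export and the top 10 list involve, here is a small sketch using only the standard library. It does not use the parser's own objects: the `MESSAGES` records, the column names, and the functions `top_contacts` and `write_csv` are all hypothetical stand-ins, and the sketch is written for Python 3.

```python
import csv
from collections import Counter

# Hypothetical stand-in records: (contact, sender, timestamp, text).
MESSAGES = [
    ("Alice", "Me", "2015-01-03T10:00:00", "Hi"),
    ("Alice", "Alice", "2015-01-03T10:01:00", "Hello"),
    ("Bob", "Me", "2015-02-11T18:30:00", "Lunch?"),
]


def top_contacts(messages, limit=10):
    """Return (contact, message_count) pairs, most messaged first."""
    counts = Counter(contact for contact, _sender, _when, _text in messages)
    return counts.most_common(limit)


def write_csv(messages, handle):
    """Write one header row followed by one row per message."""
    writer = csv.writer(handle)
    writer.writerow(["contact", "sender", "timestamp", "text"])
    writer.writerows(messages)
```

When writing to a real file on Python 3, opening it with `open(path, "w", newline="")` avoids blank lines between rows on Windows.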
The fb_chat.Chat object returned by the parser (the object called Facebook.Chat in facebook.py) could be pickled and loaded in another program to form a base API for interacting with the messages. (Note that the pickle, like the export, contains private messages in plain text, and that the fb_chat code may need to be imported in the loading program too.)
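The pickle round trip might look something like the sketch below. A plain dictionary stands in for the fb_chat.Chat object here, and `dump_chat` and `load_chat` are hypothetical helper names, not functions from this repository. With the real object, the loading program would need the fb_chat module importable so that pickle can rebuild the class.

```python
import pickle

# Hypothetical stand-in for the parsed chat: thread name -> list of messages.
CHAT = {
    "Alice": [("Me", "Hi"), ("Alice", "Hello")],
    "Bob": [("Me", "Lunch?")],
}


def dump_chat(chat, path):
    """Serialise the chat object to a binary pickle file."""
    with open(path, "wb") as handle:
        pickle.dump(chat, handle, protocol=2)  # protocol 2 is readable by Python 2.7 and 3


def load_chat(path):
    """Load a chat object back from a pickle file written by dump_chat."""
    with open(path, "rb") as handle:
        return pickle.load(handle)
```

Pickle files can execute code when loaded, so only load pickles that you produced yourself.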
Producing Graphs
The fb_analysis.py file contains code to produce a stacked histogram showing the number of messages sent to and received from a contact each month.
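To illustrate the idea, the sketch below counts sent and received messages per month and draws them as a stacked bar chart. It is not the code in fb_analysis.py: `MY_NAME`, `MESSAGES`, `monthly_counts` and `plot_monthly` are hypothetical names, the input is assumed to be (sender, datetime) pairs, and a stacked bar chart is used as a simple stand-in for the histogram.

```python
from collections import Counter
from datetime import datetime

MY_NAME = "Me"  # hypothetical: the account owner's name as it appears in messages

# Hypothetical stand-in records: (sender, timestamp).
MESSAGES = [
    ("Me", datetime(2015, 1, 3, 10, 0)),
    ("Alice", datetime(2015, 1, 3, 10, 1)),
    ("Me", datetime(2015, 2, 11, 18, 30)),
]


def monthly_counts(messages, my_name=MY_NAME):
    """Return (months, sent, received) lists aligned by 'YYYY-MM' label."""
    sent, received = Counter(), Counter()
    for sender, when in messages:
        month = when.strftime("%Y-%m")
        if sender == my_name:
            sent[month] += 1
        else:
            received[month] += 1
    # Only months with at least one message appear; empty months are skipped.
    months = sorted(set(sent) | set(received))
    return months, [sent[m] for m in months], [received[m] for m in months]


def plot_monthly(months, sent, received, path="messages_per_month.png"):
    """Save a stacked bar chart; matplotlib is imported here so counting works without it."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file without needing a display
    import matplotlib.pyplot as plt

    positions = list(range(len(months)))
    fig, ax = plt.subplots()
    ax.bar(positions, sent, label="Sent")
    ax.bar(positions, received, bottom=sent, label="Received")  # stacked on top of sent
    ax.set_xticks(positions)
    ax.set_xticklabels(months, rotation=45)
    ax.set_ylabel("Messages")
    ax.legend()
    fig.savefig(path)
```

Calling `plot_monthly(*monthly_counts(MESSAGES))` would write the chart to messages_per_month.png in the current directory.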
A browser-based interface
If you want to view the export in a browser (and don't want to use the perfectly serviceable way of viewing Facebook Messages in a browser that is www.facebook.com) then Flask Facebook Messages may be of use. Add Facebook.dump_to_pickle() on a new line after line 52 of facebook.py to produce a pickle export, then use the code in that repository to view it!
The code is written in Python 2.7.
The parser uses Beautiful Soup to do the bulk of the capture from the htm file.
The analysis code uses matplotlib to produce graphs of message counts. An example graph can be found in the samples directory.
Anaconda Python for scientific computing is an easy way to install all the dependencies for the code, alongside many other useful libraries. It can be downloaded from the Anaconda website.