Not working for actual Facebook data
I have a Facebook friend named Deepika Meena. When I used the tool to extract the word cloud for our conversation, it gave the following error:
facebook_wordcloud -c config.json messages.htm "Deepika Meena"
Loading file...
Building HTML tree...
Parsing messages...
Traceback (most recent call last):
File "/usr/local/bin/facebook_wordcloud", line 9, in <module>
load_entry_point('facebook-wordcloud==1.21', 'console_scripts', 'facebook_wordcloud')()
File "/usr/local/lib/python2.7/dist-packages/facebook_wordcloud/command_line.py", line 51, in main
thread = message_parser.parse_thread(users)
File "/usr/local/lib/python2.7/dist-packages/facebook_wordcloud/message_parser.py", line 152, in parse_thread
raise MessageParserException("Conversation thread could not be found")
Also, when I ran it on the provided example, I got this:
facebook_wordcloud -c config.json messages_sample.htm "Linus Torvalds"
Loading file...
Building HTML tree...
Parsing messages...
Found 5 messages in thread #1
RESULTS: Parsed 1 threads and 5 messages for 5 text messages
Analyzing messages for word frequencies...
Filtering out stop words...
Getting top words...
Generating mask image...
/usr/local/lib/python2.7/dist-packages/matplotlib/font_manager.py:273: UserWarning: Matplotlib is building the font cache using fc-list. This may take a moment.
warnings.warn('Matplotlib is building the font cache using fc-list. This may take a moment.')
Traceback (most recent call last):
File "/usr/local/bin/facebook_wordcloud", line 9, in <module>
load_entry_point('facebook-wordcloud==1.21', 'console_scripts', 'facebook_wordcloud')()
File "/usr/local/lib/python2.7/dist-packages/facebook_wordcloud/command_line.py", line 91, in main
wordcloud = WordCloud(**wordcloud_args).generate_from_frequencies(freq_top)
File "/usr/local/lib/python2.7/dist-packages/wordcloud/wordcloud.py", line 350, in generate_from_frequencies
frequencies = sorted(frequencies.items(), key=item1, reverse=True)
AttributeError: 'list' object has no attribute 'items'
Thanks for bringing these to my attention. The error you hit with the provided example occurred because the interface to the wordcloud library changed recently. I fixed that in 6a99ae6.
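For anyone hitting the same `AttributeError` on an older install: the traceback shows `generate_from_frequencies` calling `.items()` on its argument, so it needs a mapping rather than a list. A minimal sketch of one way to adapt the data, assuming `freq_top` is a list of `(word, count)` pairs (the helper name `as_frequency_dict` is made up for illustration, and this is not necessarily what 6a99ae6 does):

```python
def as_frequency_dict(freq_top):
    """Return a word -> count dict from a dict or a list of (word, count) pairs."""
    # dict() accepts both an existing mapping and an iterable of 2-tuples,
    # so the same call works whichever shape the caller has.
    return dict(freq_top)


# Hypothetical sample data standing in for the tool's top-word list.
freq_top = [("kernel", 12), ("patch", 7), ("git", 5)]
frequencies = as_frequency_dict(freq_top)

# frequencies now supports .items(), which is what the traceback shows
# wordcloud calling. It would then be passed along as:
#   WordCloud(**wordcloud_args).generate_from_frequencies(frequencies)
```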
Your first issue is a bigger problem. I downloaded the messages archive and noticed that Facebook switched from identifying users by their names to identifying them by a #######@facebook.com address. If you find your friend's numeric profile ID (e.g. by opening the file and looking manually, or by using something like http://findmyfbid.com/), it should work properly once you grab the latest version, 1.1 (I had to make a few changes in 5152624).
The other big issue is that Facebook did not backfill this change, so messages from before it are still identified by real names. A single thread can therefore contain a mix of both forms, which is hard to handle. For now, the messages identified by name are going to be ignored until I can come up with a better solution.
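One possible direction, sketched here only as an illustration and not as what the tool currently does, would be to accept either form when deciding whether a message sender is the requested friend. The function name `matches_user`, the sample name, and the sample profile ID are all hypothetical:

```python
def matches_user(sender, name, profile_id):
    """Return True if a sender label refers to the given friend.

    Accepts both the older real-name form and the newer
    <profile_id>@facebook.com form described above.
    """
    label = sender.strip()
    return label == name or label == "{}@facebook.com".format(profile_id)


# Hypothetical friend; the ID is a made-up placeholder.
name = "Example Friend"
profile_id = "100000000000001"

senders = ["Example Friend", "100000000000001@facebook.com", "Someone Else"]
matched = [s for s in senders if matches_user(s, name, profile_id)]
```

This requires the caller to supply both the name and the numeric ID, since the archive itself gives no obvious way to link the two.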
Sorry that this isn't a great solution!