/twitter-NLP

NLP (Natural Language Processing) for twitter

Primary LanguagePythonGNU General Public License v2.0GPL-2.0

twitter-NLP (WIP Thanks for your patience)

NLP (Natural Language Processing) python analyzer for twitter

What does it do?

NLP for twitter is a collection of routines that will allow the user to download the tweets of an specific user and run NLP (Natural Language Processing) analysis.

The results will include:

  • Wordcloud chart
  • Sentiment scatter chart
  • Tweet frequency chart
  • Word count
  • Sentence count
  • Vocabulary richnness
  • Number of stopwords used
  • Number of profanity words
  • Estimated text reading time
  • Most common words
  • Sentiment analysis

What do you need?

Libraries:

  • Python 3.0 or higher
  • Twython
  • NLTK
  • Textblob
  • Wordcloud

Twitter:

  • consumer_key
  • consumer_secret
  • access_token
  • access_token_secret

Configuration:

Create a file with the name auth.py
add the following lines:
consumer_key = 'your consumer key'
consumer_secret = 'your consumer secret'
access_token = 'your access token'
access_token_secret = 'your token secret'

If you need to create your consumer key click here

That´s it!

Usage:

To dowload tweets:
python tweets_get.py --user <twitter_user> [--count <number_of_tweets>]
note: the maximum number of tweets is 200

To analyze tweet sentiment:
python tweets_sentiment.py --file <tweets_file.json>

To perform text analysis on tweets:
python tweets_text_analysis.py --file <tweets_file.json> --lang <en|es>

To plot sentiment scatter chart:
python tweets_scatter_v2.py --file <tweets_file.json>

To plot tweet sentiment and tweeting frequency:
python tweets_graph.py --file <tweets_file.json>

To listen to tweets:
python tweets_listener.py --track <keyword> [--lang [en|es]]
note: the number of tweets to listen to is controlled by a coded constant. Adjust this value to your needs

Have a question or a suggestion?

Feel free to add an issue to the repo or contact me at: contact@steincastillo.com

Profanity warning

Please be advised that this repository contains files with a list of profanity words used in english (EN) and spanish (ES). The only purpose of these files is to conduct NLP research and analisys on texts and should be treated as such.