Twitter is an enormous source of information on a variety of topics. The extracted data can be used for insights, information extraction, sentiment analysis, and much more. In this project, I extracted tweets using the Tweepy library, then explored a few preprocessing techniques and models for sentiment analysis. The project is still in progress: its primary objective is information extraction, which is yet to be implemented.
- appCredentials.py : Access tokens for the Twitter API.
- Stream.py : Classes and methods to stream tweets and their attributes from the Twitter API.
- preprocessor.py : Preprocessing techniques:
  - Stemming
  - Lemmatizing
  - Subjectivity and Polarity
  - Frequency Distribution of words
  - Count Vectorizer
  - TF-IDF matrix
  - Text to Sequence
  - POS tagger
  - Named Entity Recognizers
- models.py : Models explored:
  - SVM classifier
  - Naive Bayes Model
  - XGBoost classifier
  - LSTM
- sentiment_analysis.ipynb : Exploring the various techniques listed in the modules above
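
To give a feel for two of the preprocessing steps listed above (frequency distribution of words and count vectors), here is a minimal standard-library sketch. It is not the code in preprocessor.py: the helper names (`tokenize`, `frequency_distribution`, `count_vectors`), the cleaning rules, and the sample tweets are all assumptions made for illustration, and a real pipeline would more likely rely on NLTK or scikit-learn.

```python
from __future__ import annotations

import re
from collections import Counter

# Hypothetical helpers for illustration; none of these names come from preprocessor.py.
URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")
TOKEN_RE = re.compile(r"[a-z']+")


def tokenize(tweet: str) -> list[str]:
    """Lowercase a tweet, drop URLs and @mentions, and split it into word tokens."""
    text = URL_RE.sub(" ", tweet.lower())
    text = MENTION_RE.sub(" ", text)
    return TOKEN_RE.findall(text)


def frequency_distribution(tweets: list[str]) -> Counter:
    """Count how often each token occurs across all tweets."""
    counts: Counter = Counter()
    for tweet in tweets:
        counts.update(tokenize(tweet))
    return counts


def count_vectors(tweets: list[str]) -> tuple[list[str], list[list[int]]]:
    """Build a sorted vocabulary and one bag-of-words count row per tweet."""
    tokenized = [tokenize(t) for t in tweets]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    rows = []
    for toks in tokenized:
        per_tweet = Counter(toks)
        rows.append([per_tweet[word] for word in vocab])
    return vocab, rows


# Made-up sample data, not tweets collected by Stream.py.
sample_tweets = [
    "Loving the new phone! https://example.com @store",
    "The new phone battery is bad, really bad",
]
freq = frequency_distribution(sample_tweets)
vocab, rows = count_vectors(sample_tweets)
```

Each row in `rows` lines up with `vocab`, so it can be fed to a classifier as a feature vector. A TF-IDF matrix would reweight these same counts by how rare each word is across tweets.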