Diego Duque
The purpose of this project is to understand and analyze tweets related to Apple. The company has grown steadily more popular over the years and is a contemporary reference point for high tech. Python makes it possible to analyze tweets on any subject at scale; here, the subject is Apple.
The project culminates in unsupervised learning. A topic model is also fitted, and sentiment analysis of the dataset is provided at the end of the code.
The project uses a public Kaggle dataset containing 1,624 unique tweets.
- Sklearn
  - from sklearn.feature_extraction.text import TfidfVectorizer, ENGLISH_STOP_WORDS
  - from sklearn.decomposition import NMF
  - from sklearn.cluster import KMeans
  - from sklearn.manifold import TSNE
- NLTK
  - from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer
- Numpy
- Pandas
  - import pandas_datareader.data as web
- Plotting
  - import matplotlib.pyplot as plt
  - %matplotlib inline
  - from mpl_toolkits.mplot3d import Axes3D
- topic_0: STOCK - AAPL stock price, investing ideas and techniques
- topic_1: Negative_Sentiment - f*** and hate
- topic_2: Products - Apple products and technical issues
- topic_3: Apple_Event - rumors on new devices, new iPhone
- topic_4: Chargers - poor quality of iPhone and MacBook chargers
- topic_5: STOCK and trade - AAPL stock and Apple's global trade
- topic_6: Positive_Sentiment - happy customers
- topic_7: Founders - Steve Jobs and Wozniak
- topic_8: Quality - negative comments, again about broken products and iOS
Positive
Negative
With k = 9, the plot shows nine clusters, each in a different color, projected onto two dimensions with the t-SNE dimensionality reduction method.
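The following sketch illustrates how such a plot could be produced with KMeans and TSNE. The random feature matrix stands in for the real tweet features (for example a TF-IDF or document-topic matrix), and the sample count, perplexity, and the `plot_clusters` helper are assumptions made for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.manifold import TSNE

# Hypothetical stand-in for the tweet feature matrix (90 tweets x 9 features).
rng = np.random.default_rng(0)
features = rng.random((90, 9))

# Cluster the tweets into k = 9 groups.
k = 9
kmeans = KMeans(n_clusters=k, n_init=10, random_state=0)
labels = kmeans.fit_predict(features)

# Project to 2 dimensions; perplexity must be smaller than the sample count.
tsne = TSNE(n_components=2, perplexity=30, random_state=0)
embedding = tsne.fit_transform(features)


def plot_clusters(points, cluster_labels):
    """Scatter the 2D embedding, one color per cluster."""
    # Imported here so the clustering above runs without a plotting backend.
    import matplotlib.pyplot as plt

    plt.scatter(points[:, 0], points[:, 1], c=cluster_labels, cmap="tab10", s=15)
    plt.title("t-SNE projection of KMeans clusters (k = 9)")
    plt.show()
```

Calling `plot_clusters(embedding, labels)` draws the figure. Note that t-SNE preserves local neighborhoods rather than global distances, so the spacing between clusters in the plot should not be read as a measure of how different the clusters are.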