Stocktalk

Data collection toolkit for social media analytics

Quickstart

Track tweet volume and sentiment in real time

from stocktalk import streaming, visualize

# credentials and path are defined as shown under Twitter Streaming
streaming(credentials, 'TSLA', ['TSLA', 'Tesla'], 30, path, realtime=True, logSentiment=True)
visualize('TSLA', 30, path)

Content

Install

pip install stocktalk

Download Corpus

stocktalk-corpus
or
python -m nltk.downloader vader_lexicon

Code Examples

Twitter Streaming

from stocktalk import streaming

# Credentials to access Twitter API 
API_KEY = 'XXXXXXXXXX'
API_SECRET = 'XXXXXXXXXX'
ACCESS_TOKEN = 'XXXXXXXXXX'
ACCESS_TOKEN_SECRET = 'XXXXXXXXXX'
credentials = [API_KEY, API_SECRET, ACCESS_TOKEN, ACCESS_TOKEN_SECRET]

# First element must be the ticker/name; subsequent elements are extra queries
TSLA = ['TSLA', 'Tesla']
SNAP = ['SNAP', 'Snapchat']
AAPL = ['AAPL', 'Apple']
AMZN = ['AMZN', 'Amazon']

# Variables
tickers = [TSLA, SNAP, AAPL, AMZN]   # Used for identification purposes
queries = TSLA + SNAP + AAPL + AMZN  # Keep only tweets containing at least one query
refresh = 30                         # Process and log data every 30 seconds

# Create a folder to collect logs and temporary files
path = "/Users/Anthony/Desktop/Data/"

streaming(credentials, tickers, queries, refresh, path,
          realtime=True, logTracker=True, logTweets=True, logSentiment=True, debug=True)
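
How the library attributes a tweet to a ticker is not shown in this README. As a rough sketch only, assuming a case-insensitive substring match, the relationship between `tickers` (grouped, name first) and `queries` (flat) could look like the following. The helper `match_tickers` is hypothetical and not part of stocktalk.

```python
# Hypothetical helper, not part of stocktalk: illustrates how each ticker
# group (name first, extra queries after) could be matched against a tweet.
TSLA = ['TSLA', 'Tesla']
SNAP = ['SNAP', 'Snapchat']

tickers = [TSLA, SNAP]
queries = TSLA + SNAP   # flat list: ['TSLA', 'Tesla', 'SNAP', 'Snapchat']


def match_tickers(tweet, tickers):
    """Return the ticker names whose queries appear in the tweet.

    Assumption: case-insensitive substring matching.
    """
    text = tweet.lower()
    return [group[0] for group in tickers
            if any(query.lower() in text for query in group)]


print(match_tickers("Tesla deliveries beat estimates", tickers))
print(match_tickers("Snapchat and $TSLA both rallied", tickers))
```

The first element of each group serves as the label, which is why it must be the ticker or name.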

Realtime Visualization

from stocktalk import visualize

# Make sure these variables are consistent with streaming.py
tickers = ['TSLA','SNAP','AAPL','AMZN']
refresh = 30
path = "/Users/Anthony/Desktop/Data/"

visualize(tickers, refresh, path)

'''
Steps to run a local Bokeh server
1. Make sure streaming.py is running
2. In a console, change to the directory containing visualize.py
3. Run: python -m bokeh serve --show visualize.py
'''

# Note: Volume is the thick blue line while sentiment is the thin white line

Major Features

Debugging Mode

Streaming Now...

---10:00:00---
TSLA Volume: 25
TSLA Sentiment: 0.29
SNAP Volume: 218
SNAP Sentiment: 0.03
AAPL Volume: 63
AAPL Sentiment: 0.14
AMZN Volume: 64
AMZN Sentiment: 0.34

---10:00:30---
TSLA Volume: 23
TSLA Sentiment: -0.05
SNAP Volume: 298
SNAP Sentiment: 0.02
AAPL Volume: 112
AAPL Sentiment: 0.01
AMZN Volume: 150
AMZN Sentiment: 0.11

Tracker Log Format

TSLA_Tracker.txt
datetime,volume,sentiment,duration
03/01/2017 10:30:00,22,0.26,30
03/01/2017 10:30:30,27,0.33,30
03/01/2017 10:31:00,24,0.23,30
03/01/2017 10:31:30,23,0.25,30
03/01/2017 10:32:00,25,0.18,30
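
Because the tracker log is plain comma-separated text with a header row, it can be read with Python's standard `csv` module. The sketch below is illustrative only: `read_tracker` is a hypothetical helper, and the month/day/year date ordering is an assumption, since the sample rows do not disambiguate it.

```python
import csv
import io
from datetime import datetime

# Sample rows copied from the tracker log format above. In practice, pass an
# open file handle for TSLA_Tracker.txt instead of an in-memory string.
sample = """datetime,volume,sentiment,duration
03/01/2017 10:30:00,22,0.26,30
03/01/2017 10:30:30,27,0.33,30
03/01/2017 10:31:00,24,0.23,30
"""


def read_tracker(handle):
    """Parse tracker rows into dictionaries with typed values."""
    rows = []
    for row in csv.DictReader(handle):
        rows.append({
            # Assumption: month/day/year ordering
            'datetime': datetime.strptime(row['datetime'], '%m/%d/%Y %H:%M:%S'),
            'volume': int(row['volume']),
            'sentiment': float(row['sentiment']),
            'duration': int(row['duration']),
        })
    return rows


rows = read_tracker(io.StringIO(sample))
total_volume = sum(r['volume'] for r in rows)
print(total_volume)  # 73
```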

Tweets Log Format

TSLA_Tweets.txt
datetime,tweet,sentiment
03/01/2017 10:30:02,#Tesla zeroing in market with strong relations,0.54
03/01/2017 10:30:03,$TSLA needs 8 Billion for Supercharger network,0.0
03/01/2017 10:30:03,#Tesla grossing high yet still losing money,-0.32
03/01/2017 10:30:03,Tesla's soon to be as affordable as gas-powered cars,0.11
03/01/2017 10:30:05,The technical reason why Tesla shares could soon rise,0.42
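
The sample rows contain no commas inside the tweet text, so it is unclear whether the logger escapes them. A cautious way to read a line, sketched here with a hypothetical helper `parse_tweet_line`, is to split once from the left for the timestamp and once from the right for the score, leaving everything in between as the tweet. The month/day/year ordering is again an assumption.

```python
from datetime import datetime


def parse_tweet_line(line):
    """Split one log line into (datetime, tweet, sentiment).

    Splitting once from each end keeps any commas inside the tweet intact.
    """
    stamp, rest = line.strip().split(',', 1)
    tweet, score = rest.rsplit(',', 1)
    # Assumption: month/day/year ordering
    when = datetime.strptime(stamp, '%m/%d/%Y %H:%M:%S')
    return when, tweet, float(score)


line = "03/01/2017 10:30:03,#Tesla grossing high yet still losing money,-0.32"
when, tweet, score = parse_tweet_line(line)
print(when, tweet, score)
```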

Underlying Features

Text Processing

textOne = "@TeslaMotors shares jump as shipments more than double! #winning"
print(process(textOne))

textTwo = "Tesla announces its best sales quarter: http://trib.al/RbTxvSu $TSLA" 
print(process(textTwo))

textThree = "Tesla $TSLA reports deliveries of 24500, above most views."
print(process(textThree))

Output:
shares jump as shipments more than double winning
tesla announces its best sales quarter
tesla reports deliveries of number above most views
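
Judging from the three outputs above, processing appears to lowercase the text, drop URLs, @mentions and $cashtags, replace numbers with the word "number", and strip punctuation. The sketch below reproduces those three results with regular expressions. It is not the library's actual `process` implementation, and the name `process_sketch` is hypothetical.

```python
import re


def process_sketch(text):
    """Approximate the cleaning steps inferred from the sample outputs."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)            # drop URLs
    text = re.sub(r"[@$]\w+", " ", text)                 # drop mentions and cashtags
    text = re.sub(r"\d+(?:[.,]\d+)*", " number ", text)  # replace numbers
    text = re.sub(r"[^a-z\s]", " ", text)                # strip punctuation, incl. '#'
    return " ".join(text.split())                        # collapse whitespace


print(process_sketch("@TeslaMotors shares jump as shipments more than double! #winning"))
print(process_sketch("Tesla announces its best sales quarter: http://trib.al/RbTxvSu $TSLA"))
print(process_sketch("Tesla $TSLA reports deliveries of 24500, above most views."))
```

URLs are removed before punctuation so that their fragments do not survive as stray words.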

Sentiment Analysis

textOne = "shares jump as shipments more than double winning"
print(sentiment(textOne))

textTwo = "tesla reports deliveries of number above most views"
print(sentiment(textTwo))

textThree = "not looking good for tesla competition on the rise"
print(sentiment(textThree))

Output:
0.706
0.077
-0.341
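
The install step downloads `vader_lexicon`, so these values are plausibly VADER compound scores in the range -1 to 1, although this README does not say so explicitly. Under that assumption, the cutoffs of ±0.05 commonly suggested for VADER could be used to bucket a score. The helper `label` below is hypothetical and not part of stocktalk.

```python
def label(score, threshold=0.05):
    """Bucket a score as positive, negative, or neutral.

    Assumption: the score behaves like a VADER compound score in [-1, 1].
    """
    if score >= threshold:
        return 'positive'
    if score <= -threshold:
        return 'negative'
    return 'neutral'


for score in (0.706, 0.077, -0.341):
    print(score, label(score))
```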