Crowd-sourced stock analyzer and stock predictor using Elasticsearch, Twitter, News headlines and Python natural language processing and sentiment analysis. How much do emotions on Twitter and news headlines affect a stock's price? Let's find out ...
stocksight is open source software for crowd-sourced stock analysis that uses Elasticsearch to store Twitter and news headline data for stocks. It analyzes the emotions in what an author writes and runs sentiment analysis on the text to determine how the author "feels" about a stock. stocksight then aggregates the analysis of all collected data from all sources.
Each user running stocksight has a unique fingerprint: the specific stocks they follow, and the news sites and Twitter users they follow to find information about those stocks. This produces a unique sentiment analysis for each user, based on the data sources they have stocksight search. Users can follow the same stocks, but their data sources may vary significantly, producing different sentiment analyses for the same stock. The stocksight website (coming soon) will let each user see sentiment analysis results from other stocksight users' apps, along with a combined aggregated view of all of them.
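To make the aggregated view concrete, here is a minimal sketch, assuming every collected item has already been scored with a polarity between -1 and 1. The function name `aggregate_sentiment` and the tuple layout are illustrative assumptions, not stocksight's actual code.

```python
from collections import defaultdict
from statistics import mean


def aggregate_sentiment(scored_items):
    """Average polarity per stock symbol across all sources.

    scored_items: iterable of (symbol, source, polarity) tuples, where
    polarity is a float in [-1.0, 1.0]. This layout is hypothetical.
    """
    by_symbol = defaultdict(list)
    for symbol, _source, polarity in scored_items:
        # Symbols are lowercased so "TSLA" and "tsla" land in the same bucket.
        by_symbol[symbol.lower()].append(polarity)
    return {symbol: mean(values) for symbol, values in by_symbol.items()}


items = [
    ("TSLA", "twitter", 0.5),
    ("TSLA", "news", -0.25),
    ("AAPL", "news", 0.25),
]
print(aggregate_sentiment(items))
```

Two users following the same symbol with different sources would feed different `items` into the same function, which is why their results can diverge.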
Version 0.2 went through an architectural revamp. To retain your previous data, you will have to copy the v0.1 data from Elasticsearch 5.6 to Elasticsearch 7.3.
The Elasticsearch index mappings also differ between the two versions. The new version records additional data for sentiment and stock prices. See the "src/StockSight/EsMap" files for details.
Differences:
- Each symbol has its own set of price and sentiment indexes.
- Each symbol has its own dashboard in Kibana.
- Each sentiment record has one sentiment value for its title and another for its message.
- Title sentiment and message sentiment are no longer mixed together.
- Stock price open and close values are also saved in the price index.
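The per-symbol layout can be pictured with a short sketch. The index naming scheme and the field names below are assumptions made for illustration only; the real mappings live in the "src/StockSight/EsMap" files.

```python
from dataclasses import dataclass, asdict


def index_names(symbol):
    """Return hypothetical per-symbol index names (the naming is an assumption)."""
    s = symbol.lower()
    return {
        "price": f"stocksight_{s}_price",
        "sentiment": f"stocksight_{s}_sentiment",
    }


@dataclass
class SentimentRecord:
    # Illustrative fields: title and message sentiment are stored separately.
    title: str
    message: str
    title_polarity: float
    message_polarity: float


record = SentimentRecord(
    title="Tesla beats delivery estimates",
    message="Analysts remain split on margins.",
    title_polarity=0.6,
    message_polarity=-0.1,
)
print(index_names("TSLA"))
print(asdict(record))
```

Keeping the two polarity values in separate fields lets a dashboard chart headline tone and body tone independently.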
- Docker
- Python 3 (tested with Python 3.6.8 and 3.7.4)
- Elasticsearch 7.3.1
- Kibana 7.3.1
- Redis 5
- Python modules:
  - elasticsearch
  - nltk
  - requests
  - tweepy
  - beautifulsoup4
  - textblob
  - vaderSentiment
  - pytz
  - redis
  - pyyaml
  - fake-useragent
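If you run stocksight outside of Docker, a quick way to see which of these modules are missing is sketched below. It uses only the standard library; the helper name `missing_modules` is made up for this example.

```python
from importlib.util import find_spec

# Import names for the packages listed above. A few differ from the
# package name: beautifulsoup4 -> bs4, pyyaml -> yaml,
# fake-useragent -> fake_useragent.
REQUIRED_MODULES = [
    "elasticsearch",
    "nltk",
    "requests",
    "tweepy",
    "bs4",
    "textblob",
    "vaderSentiment",
    "pytz",
    "redis",
    "yaml",
    "fake_useragent",
]


def missing_modules(names):
    """Return the subset of top-level module names that cannot be found."""
    return [name for name in names if find_spec(name) is None]


missing = missing_modules(REQUIRED_MODULES)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All required modules are installed.")
```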
$ git clone https://github.com/shirosaidev/stocksight.git
$ cd stocksight
- Copy src/config.yml to src/config.yml
- Change settings in config.yml to fit your needs
- Change the Elasticsearch credentials if needed
- Change NLTK analyzer ignore words (see sentiment_analyzer:ignore_words:)
- Add your Twitter credentials and change the Twitter feeds
  - Create a new Twitter application and generate your consumer key and access token.
    - https://developer.twitter.com/en/docs/basics/developer-portal/guides/apps.html
    - https://developer.twitter.com/en/docs/basics/authentication/guides/access-tokens.html
- Add the desired stock symbols and their required words to the symbols section (see symbol: tsla)
- Change the execution intervals in docker-compose.yml
  - Defaults: 120 seconds for the stock price listener, 3600 seconds for the news sentiment listeners
- Run "docker-compose up"
- ???
- Profit
The following actions must be run in the python3 container.
- open src/config.yml
- add stock symbol to symbol section.
- add required keyword of the symbol.
- the sentiment and price listeners will pick up the change on their next run.
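As a rough illustration of what the required keywords are for, the sketch below keeps a piece of text only if it mentions at least one required word for the symbol. The dictionary layout, the "any word matches" rule, and the function name are all assumptions; check src/config.yml and the listener code for the actual behavior.

```python
# Hypothetical in-memory view of the symbols section of config.yml.
symbols = {
    "tsla": ["tesla", "elon"],
}


def matches_required_words(text, required_words):
    """Return True if text contains at least one required word (case-insensitive)."""
    lowered = text.lower()
    return any(word.lower() in lowered for word in required_words)


headlines = [
    "Tesla opens new factory",
    "Oil prices climb again",
]
kept = [h for h in headlines if matches_required_words(h, symbols["tsla"])]
print(kept)
```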
- Update the config.yml
- Log into the python container
- Kill the running twitter.sentiment.py process
- Rerun it with "python twitter.sentiment.py &"
- See SeekAlphaListener and YahooFinanceListener as examples.
- Add your class to news.sentiment.py
- the sentiment runner will pick up the new listener on its next run.
- Make changes to your existing template and visualizations.
- Export them to kibana_export/export.7.3.ndjson
- Replace symbol with "tmpl" or change the id and index value to match existing ndjson.
- Run "KIBANA_OVERWRITE=true python import.kibana.py"
- Log into the python container console
- Run "python delindex.py --delindex {index_name}"