Analysing r/wallstreetbets for the dankest daily updates. Check our project out at https://wsbstonks.com
We scrape the r/wallstreetbets subreddit and run NLP analysis on stock mentions, their popularity over time, and keyword mentions.
To achieve this, we have created a Flask API that handles interaction with a MongoDB NoSQL database and schedules cron jobs to scrape r/wallstreetbets with PRAW (a Reddit API wrapper). The visualizations of the data are rendered by a React web app, which accesses the Flask API via a proxy. Both the frontend and backend services are hosted on a Digital Ocean Virtual Private Server (VPS) Droplet, while for the MongoDB service we use a MongoDB Atlas cloud cluster.
In the past, we hosted the project on the GCP App Engine serverless computing environment. If you would like to see how to configure our React SPA and Python/Flask API for GCP App Engine, head over to the App Engine branch of this repo.
At the moment, keyword analysis is performed using pytextrank. Everything else is built in-house, including stock symbol recognition.
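As a rough illustration of what stock symbol recognition can involve, here is a minimal sketch. It is not the project's actual in-house implementation. The names `KNOWN_TICKERS`, `AMBIGUOUS` and `count_mentions` are hypothetical, and the tiny symbol sets are placeholders for a full exchange listing. The assumed idea is to count uppercase tokens that appear in a known-symbol list, and to require a `$` prefix for symbols that double as everyday words or subreddit slang.

```python
import re
from collections import Counter

# Hypothetical, illustrative symbol list; a real one would come from an exchange listing.
KNOWN_TICKERS = {"GME", "AMC", "TSLA", "DD", "ALL"}
# Symbols that collide with common words or slang; only counted when written as $SYMBOL.
AMBIGUOUS = {"DD", "ALL"}

# An optional "$", then 1-5 uppercase letters, not glued to a preceding letter, digit or "$".
TOKEN_RE = re.compile(r"(?<![A-Za-z0-9$])(\$?)([A-Z]{1,5})\b")


def count_mentions(texts):
    """Count ticker mentions across an iterable of post or comment bodies."""
    counts = Counter()
    for text in texts:
        for dollar, symbol in TOKEN_RE.findall(text):
            if symbol not in KNOWN_TICKERS:
                continue
            if symbol in AMBIGUOUS and not dollar:
                continue  # e.g. "did my DD" is slang, "$DD" is a ticker
            counts[symbol] += 1
    return counts
```

Calling `count_mentions` once per day's worth of posts would give per-day counts that could be stored and plotted as popularity over time.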
To run the backend, first create a secrets file under backend/secrets in INI format with the keys PRAW_CLIENT_ID, PRAW_CLIENT_SECRET, PRAW_USER_AGENT, PRODUCTION_SENTRY and MONGO_URL. Then run:
cd backend/
pip install -r requirements.txt
export FLASK_APP=main.py
flask run
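For reference, an INI secrets file with the keys listed above could be read with Python's standard `configparser`. This is only a sketch: the `load_secrets` helper and the use of the `DEFAULT` section are assumptions, and the backend may load its secrets differently.

```python
import configparser

REQUIRED_KEYS = [
    "PRAW_CLIENT_ID",
    "PRAW_CLIENT_SECRET",
    "PRAW_USER_AGENT",
    "PRODUCTION_SENTRY",
    "MONGO_URL",
]


def load_secrets(path, section="DEFAULT"):
    """Read the INI file at `path` and return the required keys as a dict.

    The section name is an assumption; adjust it to match your file.
    """
    # interpolation=None keeps "%" characters (common in URLs and passwords) literal.
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # preserve the upper-case key names
    if not parser.read(path):
        raise FileNotFoundError(path)
    values = parser[section]
    missing = [key for key in REQUIRED_KEYS if key not in values]
    if missing:
        raise KeyError("missing secrets: " + ", ".join(missing))
    return {key: values[key] for key in REQUIRED_KEYS}
```

Failing fast on a missing key makes a misconfigured secrets file show up at startup rather than at the first scrape or database call.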
To run the frontend:
cd frontend
npm start
Our website is for informational purposes only. Nothing contained on our Site constitutes investment advice, solicitation, recommendation or endorsement of any investment strategies, practices, or individual decisions. By using this website, you have agreed to assume sole responsibility over assessing risks and merits associated with the use of any content found on the Site. Furthermore, you agree not to hold Site's creators liable for any claim for damages arising from any decision based on content on this Site.
Data used for the visualizations has been accessed through the Reddit API from the r/wallstreetbets subreddit using PRAW.