Anomaly detection and pump-and-dump classification in OTC market stocks using sentiment analysis and feature engineering of Twitter feeds


Project Intro/Objective

Over the past several decades, advances in technology have significantly affected all aspects of the financial system. While these advances have brought numerous benefits, they have also multiplied the ways in which the market can be manipulated. Social media is a frequent venue for such manipulation schemes. Twitter is one such platform, where potential "pumpers" tweet about certain stocks to drive their price above a certain value. This manipulative behavior is known as a "pump and dump" scheme. The objective of this project is to classify stock pumps and to identify which factors contribute to them. The hypothesis I am testing is that tweet sentiment may contribute significantly to these pumps.

Project Status ✅

  1. Coded extensively in Python, using Jupyter and Google Colab.
  2. More than 10 million tweets on 1300 different $cashtags have been downloaded via Twint, which was chosen because it places no restriction on the number of tweets that can be scraped.
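
To make the $cashtag idea concrete, here is a minimal sketch of how cashtags could be pulled out of raw tweet text and counted per ticker. It is an illustration only, not the code used in this project: the regular expression, the 1 to 5 letter ticker length, and the function names `extract_cashtags` and `count_cashtags` are all assumptions, and the sample tweets in the usage note are invented.

```python
import re
from collections import Counter

# Assumed pattern: "$" followed by 1-5 letters, not glued to other
# word characters. Real ticker formats may need a broader rule.
CASHTAG_RE = re.compile(r"(?<![A-Za-z0-9_$])\$([A-Za-z]{1,5})(?![A-Za-z0-9])")


def extract_cashtags(text):
    """Return the upper-cased tickers mentioned in one tweet, in order."""
    return [ticker.upper() for ticker in CASHTAG_RE.findall(text)]


def count_cashtags(tweets):
    """Count how many tweets mention each ticker (once per tweet)."""
    counts = Counter()
    for text in tweets:
        # set() so a tweet repeating the same cashtag is counted once
        counts.update(set(extract_cashtags(text)))
    return counts
```

For example, `count_cashtags(["$abcd to the moon", "buy $ABCD and $xyz"])` gives a count of 2 for `ABCD` and 1 for `XYZ`. Per-ticker tweet counts like these are one simple feature that could sit alongside sentiment scores.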

The files and folders which contain the work

  1. Link to project report