/Detecting-Fake-Re_Tweeters-and-Hashtag-misuse-using-Machine-Learning-and-Topic-Modelling

This project detects whether a retweeter is a fake user and whether a hashtag is misused in a tweet. It uses Hawkes processes, LDA, KNN, SVM, and Naive Bayes, together with deep learning models (fully connected (dense) and LSTM networks) and text representations such as bag of words and word embeddings (word2vec).

Primary Language: Jupyter Notebook

Detection of Fake Retweeters and Misuse of Trending Hashtags on Twitter using Machine Learning, Topic Modelling (NLP) and Deep Learning Techniques

Install all required packages (listed in requirements.txt) into a virtual environment or an Anaconda environment.

Research Paper: https://ieeexplore.ieee.org/document/9824364

pip3 install -r requirements.txt

Note: every folder contains its own README.md that explains the specific task and how to use the module.

Important versions of the technologies used:

  • TensorFlow 2.4.1
  • TensorFlow-GPU 2.4.1 (Optional)
  • CUDA Toolkit 11.1 (Optional)
  • cuDNN 8.0 (Optional)
  • NVIDIA GeForce Drivers 471.96 (Optional)
  • PyTorch 1.9.0+cu111
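
A quick way to confirm that an environment matches the versions listed above is to query the installed package metadata. The following is a minimal sketch, not part of the repository: the helper name `check_versions` is hypothetical, and the distribution names `tensorflow` and `torch` are assumptions about how the packages are installed.

```python
from importlib import metadata

# Versions taken from the list above. The distribution names are assumptions
# about how the packages are registered in the environment.
EXPECTED = {
    "tensorflow": "2.4.1",
    "torch": "1.9.0",
}


def check_versions(expected=EXPECTED):
    """Return {package: (installed_or_None, expected, matches)}."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        # Drop local build tags such as "+cu111" before comparing.
        ok = installed is not None and installed.split("+")[0] == wanted
        report[name] = (installed, wanted, ok)
    return report


if __name__ == "__main__":
    for name, (installed, wanted, ok) in check_versions().items():
        status = "OK" if ok else "MISMATCH"
        print(f"{name}: installed={installed} expected={wanted} [{status}]")
```

A package that is not installed is reported with `installed=None` rather than raising an error, so the check can run before `pip3 install -r requirements.txt` as well as after it.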

Steps in the process:

  • Data Collection
  • Data Cleaning
  • Data Visualization
  • Data Annotation
  • Building the Model
  • Training and Testing the Model
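
To illustrate the data cleaning step and the bag-of-words representation mentioned earlier, here is a minimal sketch using only the Python standard library. It is an assumption about what cleaning a tweet might involve (removing URLs, mentions, and the leading "RT" marker), not the code from the repository's notebooks; the function names `clean_tweet`, `extract_hashtags`, and `bag_of_words` are hypothetical.

```python
import re
from collections import Counter

URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")
HASHTAG_RE = re.compile(r"#(\w+)")
NON_WORD_RE = re.compile(r"[^a-z0-9\s]")


def extract_hashtags(tweet):
    """Return the lowercased hashtags of a tweet, without the '#'."""
    return [tag.lower() for tag in HASHTAG_RE.findall(tweet)]


def clean_tweet(tweet):
    """Strip URLs, mentions, the retweet marker, and punctuation."""
    text = URL_RE.sub(" ", tweet)
    text = MENTION_RE.sub(" ", text)
    text = re.sub(r"^RT\s+", "", text.strip())  # leading retweet marker
    text = HASHTAG_RE.sub(r"\1", text)          # keep the hashtag word itself
    text = NON_WORD_RE.sub(" ", text.lower())
    return " ".join(text.split())


def bag_of_words(tweet):
    """Count token occurrences in the cleaned tweet."""
    return Counter(clean_tweet(tweet).split())


if __name__ == "__main__":
    sample = "RT @user: Check this out! https://t.co/abc #Trending #AI"
    print(clean_tweet(sample))
    print(extract_hashtags(sample))
    print(bag_of_words(sample))
```

Keeping the hashtag list separate from the cleaned text makes it possible to compare what a tweet is tagged with against what it actually talks about, which is the kind of comparison hashtag-misuse detection relies on.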

Types of models used:

  • Hawkes Process
  • LDA
  • KNN
  • SVM
  • Naive Bayes
  • Fully Connected NN
  • LSTM NN
  • Bi-LSTM NN
  • Word2Vec
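
Of the models listed above, the Hawkes process is probably the least familiar. It is a self-exciting point process: each event (for example, a retweet) temporarily raises the rate at which further events are expected. The sketch below computes the conditional intensity for the common exponential kernel in plain Python. It is an illustration only and does not use the `tick` library from the references; the parameter values `mu`, `alpha`, and `beta` are arbitrary assumptions, not values from this project.

```python
import math


def hawkes_intensity(t, event_times, mu=0.1, alpha=0.5, beta=1.0):
    """Conditional intensity of a Hawkes process with an exponential kernel.

    lambda(t) = mu + sum over past events t_i < t of
                alpha * beta * exp(-beta * (t - t_i))

    mu    : baseline event rate
    alpha : expected number of follow-up events triggered by one event
    beta  : decay rate of the excitation
    """
    excitation = sum(
        alpha * beta * math.exp(-beta * (t - ti))
        for ti in event_times
        if ti < t
    )
    return mu + excitation


if __name__ == "__main__":
    # Hypothetical retweet timestamps (in hours): a tight burst, then silence.
    retweets = [1.0, 1.1, 1.2, 1.3]
    for t in (0.5, 1.35, 3.0, 10.0):
        print(f"t={t:5.2f}  intensity={hawkes_intensity(t, retweets):.4f}")
```

The intensity equals the baseline before any event, spikes right after the burst, and decays back toward the baseline afterwards. Fitting `mu`, `alpha`, and `beta` to real retweet timestamps is a separate step that this sketch does not cover.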

References

  1. https://x-datainitiative.github.io/tick/modules/hawkes.html
  2. https://pytorch.org/docs/stable/index.html
  3. https://radimrehurek.com/gensim/models/ldamodel.html