Sampad-Hegde/Detecting-Fake-Re_Tweeters-and-Hashtag-misuse-using-Machine-Learning-and-Topic-Modelling

This project detects whether a re-tweeter is a fake user and whether a hashtag is misused in a tweet. It uses ML algorithms such as the Hawkes process, LDA, KNN, SVM and Naive Bayes, as well as DL models such as a fully connected (dense) network and an LSTM network, with bag-of-words and word-embedding (word2vec) text representations.

Jupyter Notebook

Detection of Fake Re-tweeters and Misuse of Trending #Hashtags on Twitter Using Machine Learning, Topic Modelling (NLP) and Deep Learning Techniques

Research Paper : https://ieeexplore.ieee.org/document/9824364

Install all the packages into your virtual environment or Anaconda environment:

pip3 install -r requirements.txt

!! Every folder contains a README.md that explains its specific task and how to use the module.

Important versions of the technologies used :

TensorFlow 2.4.1
TensorFlow-GPU 2.4.1 (Optional)
CUDA Toolkit 11.1 (Optional)
cuDNN 8.0 (Optional)
NVIDIA GeForce Drivers 471.96 (Optional)
PyTorch 1.9.0+cu111
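
To confirm that an environment matches the versions listed above, a small check along these lines may help. It is a minimal sketch using only the Python standard library. The pip distribution names ("tensorflow", "torch") and the helper name check_versions are assumptions for illustration, not part of this repository.

```python
from importlib import metadata

# Versions taken from the list above. The pip distribution names used as
# keys are an assumption; adjust them to match your requirements.txt.
EXPECTED = {
    "tensorflow": "2.4.1",
    "torch": "1.9.0+cu111",
}


def check_versions(expected=EXPECTED):
    """Return {package: (installed_or_None, expected, matches)}."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[name] = (installed, wanted, installed == wanted)
    return report


if __name__ == "__main__":
    for name, (installed, wanted, ok) in check_versions().items():
        status = "OK" if ok else "MISMATCH"
        print(f"{name}: installed={installed} expected={wanted} [{status}]")
```

A mismatch is not necessarily fatal. The check only reports where the environment differs from the versions the project was developed with.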

Steps in the process :

Data Collection
Data Cleaning
Data Visualization
Annotating the Data
Building the Model
Training and Testing the Model
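
As an illustration of the data cleaning step, the sketch below normalizes a raw tweet and extracts its hashtags using only the standard library. The function name clean_tweet and the specific cleaning rules (dropping the retweet prefix, URLs and mentions, lowercasing, removing punctuation) are assumptions for illustration. The notebooks in this repository may clean the data differently.

```python
import re

URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")
HASHTAG_RE = re.compile(r"#(\w+)")
RT_PREFIX_RE = re.compile(r"^RT\s+@\w+:\s*")


def clean_tweet(text):
    """Return (cleaned_text, hashtags) for one raw tweet string."""
    # Collect hashtags first, before the '#' markers are removed.
    hashtags = [tag.lower() for tag in HASHTAG_RE.findall(text)]
    text = RT_PREFIX_RE.sub("", text)    # drop a leading "RT @user: "
    text = URL_RE.sub("", text)          # drop links
    text = MENTION_RE.sub("", text)      # drop @mentions
    text = HASHTAG_RE.sub(r"\1", text)   # keep the hashtag word, drop '#'
    text = re.sub(r"[^a-z0-9\s]", " ", text.lower())  # strip punctuation
    return " ".join(text.split()), hashtags


cleaned, tags = clean_tweet(
    "RT @user: Check this out! https://t.co/abc #Breaking #News @other"
)
print(cleaned)
print(tags)
```

Keeping the hashtags as a separate list makes it possible to compare them later against the topics found in the cleaned text, which is one plausible way to flag hashtag misuse.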

Types of models used :

Hawkes Process
LDA
KNN
SVM
Naive Bayes
Fully Connected NN
LSTM NN
Bi-LSTM NN
Word2Vec
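
For readers unfamiliar with the first model in the list, the sketch below evaluates the conditional intensity of a univariate Hawkes process with an exponential kernel, lambda(t) = mu + sum over t_i < t of alpha * beta * exp(-beta * (t - t_i)). It is a plain-Python illustration only. It does not use the tick library listed in the references, and the kernel parametrization, the parameter values and the example retweet timestamps are all assumptions rather than the configuration used in this project.

```python
import math


def hawkes_intensity(t, event_times, baseline, alpha, beta):
    """Intensity at time t given past events (exponential kernel).

    baseline: constant background rate (mu)
    alpha:    excitation added per past event (kernel integral)
    beta:     decay rate of that excitation
    """
    excitation = sum(
        alpha * beta * math.exp(-beta * (t - t_i))
        for t_i in event_times
        if t_i < t  # only events strictly before t contribute
    )
    return baseline + excitation


# Hypothetical retweet timestamps (in hours) for a single tweet.
retweet_times = [0.0, 0.5, 0.6]
for t in (0.7, 2.0, 6.0):
    rate = hawkes_intensity(t, retweet_times, baseline=0.1, alpha=0.5, beta=1.0)
    print(f"t={t}: intensity={rate:.4f}")
```

Each past retweet temporarily raises the expected retweet rate, and the effect decays over time, so a burst of retweets shows up as a spike in intensity that fades back to the baseline.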

References

https://x-datainitiative.github.io/tick/modules/hawkes.html
https://pytorch.org/docs/stable/index.html
https://radimrehurek.com/gensim/models/ldamodel.html