All the code used to generate the results is shared.
- dl_models.py - Code for all the deep learning (DL) models.
- ganbert.py - Code to generate results with different proportions of unlabeled data combined with the labeled data.
- cnn_model.py - Code to generate the CNN-based results.
- BERT_2k.ipynb - Code to generate the transformer-based results.
- NonDL_2K.ipynb - Code for the traditional machine learning results.
- wordcloud.py - Code to generate the word clouds for both categories.
- empath_feat.py - Extracts the Empath features.
The Data folder contains our labeled and unlabeled data, along with the code to crawl the data.
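
As a rough illustration of what "different proportions of unlabeled data" could mean for ganbert.py, the sketch below mixes a fixed labeled set with a random fraction of an unlabeled pool. It is not taken from ganbert.py: the function name `mix_unlabeled`, the `(text, label)` tuple layout, and the use of `None` as a placeholder label are all assumptions made for this example.

```python
import random


def mix_unlabeled(labeled, unlabeled, proportion, seed=0):
    """Return the labeled examples plus a random fraction of the unlabeled pool.

    Hypothetical helper, not part of this repository.
    labeled:    list of (text, label) tuples
    unlabeled:  list of text strings
    proportion: fraction of the unlabeled pool to include, between 0 and 1
    """
    if not 0.0 <= proportion <= 1.0:
        raise ValueError("proportion must be between 0 and 1")
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    k = int(proportion * len(unlabeled))  # number of unlabeled texts to draw
    sampled = rng.sample(unlabeled, k)  # sampling without replacement
    # Assumption: unlabeled examples carry None as a placeholder label.
    return list(labeled) + [(text, None) for text in sampled]


if __name__ == "__main__":
    labeled = [("first labeled text", 1), ("second labeled text", 0)]
    unlabeled = [f"unlabeled text {i}" for i in range(10)]
    for p in (0.0, 0.5, 1.0):
        mixed = mix_unlabeled(labeled, unlabeled, p)
        print(p, len(mixed))
```

Fixing the seed keeps each proportion's sample reproducible across runs, which makes results for different proportions easier to compare.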