All the code used to generate the results is shared.

  1. dl_models.py - Code for all the deep learning (DL) models.
  2. ganbert.py - Code to generate results with different proportions of unlabeled data combined with the labeled data.
  3. cnn_model.py - Code to generate the CNN-based results.
  4. BERT_2k.ipynb - Code to generate the transformer-based results.
  5. NonDL_2K.ipynb - Code for the traditional (non-deep-learning) machine learning results.
  6. wordcloud.py - Code to generate the word clouds for both categories.
  7. empath_feat.py - Code to extract the Empath features.

The Data folder contains our labeled and unlabeled data, along with the code used to crawl the data.
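
As a rough illustration of what "different proportions of unlabeled data" can mean in practice, the sketch below combines a labeled set with a random fraction of an unlabeled pool. It is not taken from ganbert.py: the function name `mix_unlabeled`, its parameters, and the use of `None` as the label for unlabeled examples are all assumptions made for this example.

```python
import random


def mix_unlabeled(labeled, unlabeled, proportion, seed=0):
    """Return the labeled examples plus a random fraction of the unlabeled pool.

    labeled:    list of (text, label) pairs
    unlabeled:  list of texts without labels
    proportion: fraction of the unlabeled pool to include, between 0 and 1
    (Hypothetical helper; names and conventions are assumptions.)
    """
    if not 0.0 <= proportion <= 1.0:
        raise ValueError("proportion must be between 0 and 1")
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    k = round(proportion * len(unlabeled))
    sampled = rng.sample(unlabeled, k)
    # Unlabeled examples are marked with None in place of a label.
    return list(labeled) + [(text, None) for text in sampled]


if __name__ == "__main__":
    labeled = [("first labeled text", 1), ("second labeled text", 0)]
    unlabeled = ["u1", "u2", "u3", "u4"]
    for p in (0.0, 0.5, 1.0):
        print(p, len(mix_unlabeled(labeled, unlabeled, p)))
```

Fixing the seed keeps the sampled subset the same across runs, which makes results at different proportions easier to compare.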