
Categorizing Sexism and Misogyny through Neural Approaches

We release the code used for our research paper, "Categorizing Sexism and Misogyny through Neural Approaches", which is under review at TWEB. Our implementation uses parts of the code from [1, 2, 3, 4] and the libraries Keras and Scikit-learn [5]. Brief descriptions of some of the contents of this repository follow.

  1. main.py
  • The main file to run for all deep-learning-based methods, including the proposed approach and the baselines
  1. neural_approaches.py
  • Training, prediction, evaluation, training data creation/transformation, loss function assignment, class imbalance correction
  1. dl_models.py
  • Deep learning architectures for the proposed approach as well as baselines
  1. load_pre_proc.py
  • Data loading, pre-processing, problem transformation, functions for our ensemble method, and other utilities
  1. sent_enc_embed.py
  • Generation of sentence representations using general-purpose sentence encoding schemes
  1. word_embed.py
  • Generation of distributional word representations
  1. ling_word_feats.py
  • Generation of a linguistic/aspect-based word-level representation
  1. gen_batch_keras.py
  • Generation of batches of inputs for training and testing
  1. auto_encode.py
  • Functions related to the autoencoder-based method for using unlabeled data and to the pre-training of BERT on a domain-specific corpus (especially data creation)
  1. eval_measures.py
  • Functions related to multi-label evaluation and result reporting
  1. traditional_ML.py
  • Traditional machine learning methods using n-gram-based and other features
  1. doc2vec_embed.py
  • Creation of a vector representation of a piece of text using doc2vec
  1. rand_approach.py
  • Random label assignment in accordance with normalized training frequencies of labels
  1. rand_sample.py
  • Creation of a small random sample of the data for quick experimentation
  1. split_labels.py
  • Label subset generation for our ensemble approach
  1. att_visualize.py
  • Functions used for quantitative and qualitative analysis
  1. config_deep_learning.txt
  • A sample configuration file for deep learning methods specifying multiple nested and non-nested parameter combinations
  1. config_traditional_ML.txt
  • A sample configuration file for traditional machine learning methods
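
As an illustration of the random baseline described for rand_approach.py, the following is a minimal sketch, not the code in this repository. It assumes the training labels are given as a binary indicator matrix (one row per post, one column per label) and that exactly one label is drawn per test post; the function names `label_frequencies` and `random_label_assignment` are hypothetical, and the actual script may represent labels differently or assign more than one label per post.

```python
import numpy as np


def label_frequencies(train_labels):
    """Return normalized label frequencies from a binary indicator matrix."""
    counts = np.asarray(train_labels).sum(axis=0).astype(float)
    return counts / counts.sum()


def random_label_assignment(train_labels, n_samples, seed=0):
    """Draw one label per sample with probability equal to its training frequency.

    Assumption: a single label per sample; this is a simplification for
    illustration and may not match the repository's behavior.
    """
    rng = np.random.default_rng(seed)
    probs = label_frequencies(train_labels)
    n_labels = probs.shape[0]
    chosen = rng.choice(n_labels, size=n_samples, p=probs)
    preds = np.zeros((n_samples, n_labels), dtype=int)
    preds[np.arange(n_samples), chosen] = 1
    return preds


if __name__ == "__main__":
    # Toy training labels: 4 posts, 3 labels (hypothetical data).
    train = [[1, 0, 0], [1, 1, 0], [1, 0, 1], [0, 1, 0]]
    print(label_frequencies(train))
    print(random_label_assignment(train, n_samples=5))
```

Predictions produced this way can be scored with the same multi-label measures as the other methods, which gives a frequency-aware lower bound to compare against.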


