TextDecepter: Hard Label Black Box attack on NLP

Note: The pretrained target models used to test the attack algorithm are taken from TextFooler.
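In the hard-label setting, the attack only ever observes the class the target model predicts, never its scores or gradients. Below is a minimal sketch of such a query interface, assuming a generic PyTorch text classifier; the wrapper class and its encoder callable are illustrative, not part of this repository.

```python
import torch

class HardLabelBlackBox:
    """Expose a trained classifier as a hard-label black box:
    every query returns only the predicted class index."""

    def __init__(self, model, encode_batch):
        self.model = model                 # any trained text classifier
        self.encode_batch = encode_batch   # callable: list[str] -> model inputs
        self.num_queries = 0               # hard-label attacks track their query budget

    @torch.no_grad()
    def predict(self, texts):
        self.num_queries += len(texts)
        logits = self.model(self.encode_batch(texts))
        return logits.argmax(dim=-1).tolist()   # hard labels only, no scores
```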

Follow these steps to run the attack algorithm:

  1. Download counter-fitted-vectors.txt and put it in the 'counter_fitting_embedding' folder.

  2. Download the GloVe embeddings, extract 'glove.6B.200d.txt', and put it in the 'word_embeddings_path' folder.

  3. Download the pretrained target model parameters (CNN, LSTM, BERT) and put them under the subdirectories 'wordCNN', 'wordLSTM', and 'BERT' in the 'saved_models' folder.

  4. Use the following syntax to run the attack algorithm:

!python Attack_Classification.py --dataset_path 'data/imdb.txt' --target_model 'bert' --counter_fitting_embeddings_path "counter_fitting_embedding/counter-fitted-vectors.txt" --target_model_path "saved_models/bert/imdb" --word_embeddings_path "word_embeddings_path/glove.6B.200d.txt" --output_dir "adv_results" --pos_filter "coarse"
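The counter-fitted vectors downloaded in step 1 are the standard resource for proposing synonym candidates in word-substitution attacks such as TextFooler. Here is a sketch of how that file can be loaded and queried, assuming the usual plain-text format of one word followed by its vector per line; the function names are illustrative, not the repository's own code.

```python
import numpy as np

def load_counter_fitted(path="counter_fitting_embedding/counter-fitted-vectors.txt"):
    """Load counter-fitted vectors: each line is a word followed by its embedding."""
    words, vecs = [], []
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip().split()
            words.append(parts[0])
            vecs.append(np.asarray(parts[1:], dtype=np.float32))
    matrix = np.stack(vecs)
    matrix /= np.linalg.norm(matrix, axis=1, keepdims=True)  # unit-normalize rows
    return words, matrix

def nearest_synonyms(word, words, matrix, k=50):
    """Return the k words whose vectors have the highest cosine similarity to `word`."""
    sims = matrix @ matrix[words.index(word)]
    top = np.argsort(-sims)[1:k + 1]   # drop the query word itself
    return [words[i] for i in top]
```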

--dataset_path can be either "data/imdb.txt" or "data/mr.txt".

--target_model can be wordCNN, wordLSTM, bert, or gcp.
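The 'gcp' option presumably targets the Google Cloud Natural Language API; since the attack only needs hard labels, any API that returns a class decision can serve as the target. Below is a hedged sketch of wrapping that API's sentiment endpoint as a binary hard-label oracle; the thresholding into labels is an assumption for illustration, not the repository's code.

```python
from google.cloud import language_v1

def gcp_hard_label(text, threshold=0.0):
    """Query Google Cloud Natural Language and reduce its sentiment score
    to a hard binary label: 1 = positive, 0 = negative (assumed mapping)."""
    client = language_v1.LanguageServiceClient()
    document = language_v1.Document(
        content=text, type_=language_v1.Document.Type.PLAIN_TEXT
    )
    response = client.analyze_sentiment(request={"document": document})
    return int(response.document_sentiment.score >= threshold)
```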

The result files can be accessed from the Google Drive link.