The original dataset in "./data/" is 100% synthetic, generated by GPT-2. We are testing whether adversarial changes can fool neural text detectors into classifying these texts as human-written. Run main.py to start experiments. Here are the global constants:
- EXPERIMENT_NAME: the name of the folder that holds the result files.
- ADVERSARIAL_TYPE: the type of change made to each text (see the adversarial types below).
- TEXT_TO_CHANGE: the number of texts to make adversarial.
Adversarial Types:
- 'do-nothing': nothing is done; the text is left unchanged.
- 'replace-char': replaces characters with the homoglyphs below.
- 'random-order-replace-char': same as 'replace-char', except the input text lines are shuffled first.
- 'misspelling': replaces certain words with misspellings from misspellings.json.
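As a rough illustration of the 'replace-char' idea, the sketch below swaps a few Latin letters for visually identical Unicode homoglyphs. The mapping here is a small hypothetical subset for illustration, not the project's actual homoglyph table.

```python
# Sketch of the 'replace-char' attack: substitute Latin letters with
# look-alike Unicode homoglyphs so the text reads the same to a human
# but is tokenized differently by a detector.
# NOTE: illustrative subset only; the real table lives in the project code.
HOMOGLYPHS = {
    "a": "\u0430",  # Cyrillic small a
    "e": "\u0435",  # Cyrillic small ie
    "o": "\u043e",  # Cyrillic small o
}

def replace_chars(text: str) -> str:
    """Replace every mapped character with its homoglyph look-alike."""
    return "".join(HOMOGLYPHS.get(ch, ch) for ch in text)
```

Visually, replace_chars("hello") still renders as "hello", but the "e" and "o" are now Cyrillic code points.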
Code for "Attacking Neural Text Detectors" (https://arxiv.org/abs/2002.11768).
Run python download_dataset.py to download the GPT-2 top-k 40 neural text test set created by OpenAI. For more documentation regarding this and similar datasets, visit https://github.com/openai/gpt-2-output-dataset.

The OpenAI RoBERTa neural text detector can be downloaded by running wget https://openaipublic.azureedge.net/gpt-2/detector-models/v1/detector-large.pt.

Install requirements via pip install -r requirements.txt.

Run python main.py to run a sample experiment.
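Put together, the setup steps above can be collected into one small script. This is a sketch assuming a Unix shell: it dry-runs by default (printing each command), and only executes them when "run" is passed as the first argument.

```shell
# Setup steps from this README collected into one script.
# Dry-run by default: prints each command. Invoke as "sh setup.sh run"
# (hypothetical filename) to actually execute them.
set -e

maybe() {
  mode="$1"; shift
  if [ "$mode" = "run" ]; then "$@"; else echo "+ $*"; fi
}

MODE="${1:-dry}"
maybe "$MODE" python download_dataset.py
maybe "$MODE" wget https://openaipublic.azureedge.net/gpt-2/detector-models/v1/detector-large.pt
maybe "$MODE" pip install -r requirements.txt
maybe "$MODE" python main.py
```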