GPT-2 Data based AI Content Detector

Important Links

Live URL: https://aicheck.developerpritam.in

GitHub: https://github.com/developer-pritam/ai-content-detector

Running a detector model

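You can launch a web UI where you enter text and see the detector model's prediction of whether it was generated by a GPT-2 model. The command below expects the detector-base.pt weights file in the current directory; see the download section further down for where to get it.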

# (in the top-level directory of this repository)
pip install -r requirements.txt
python -m detector.server detector-base.pt

After the script prints "Ready to serve", navigate to http://localhost:8080 to view the UI.
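If you would rather query the running server from a script than through the browser, the following is a minimal sketch. It rests on an assumption that is not confirmed by this README: that the server, like the upstream OpenAI detector server, answers a GET request whose query string is the percent-encoded text with a JSON object (upstream uses fields such as real_probability and fake_probability). The helper names build_query_url and query_detector are hypothetical; check detector/server.py before relying on any of this.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Address taken from the instructions above.
BASE_URL = "http://localhost:8080"


def build_query_url(text: str, base_url: str = BASE_URL) -> str:
    """Return a URL that carries `text` as the raw, percent-encoded query string.

    Assumption: the server reads the whole query string as the input text.
    """
    return f"{base_url}/?{quote(text, safe='')}"


def query_detector(text: str, base_url: str = BASE_URL) -> dict:
    """Send `text` to the running server and return the decoded JSON reply.

    Assumption: the reply is a JSON object. The exact field names are not
    documented here and should be verified against detector/server.py.
    """
    with urlopen(build_query_url(text, base_url), timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, calling query_detector("Some text to check") should return the parsed reply as a dictionary, provided the assumption about the response format holds.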

Downloading a pre-trained detector model by OpenAI

Download the weights for the fine-tuned roberta-base model (478 MB):

wget https://openaipublic.azureedge.net/gpt-2/detector-models/v1/detector-base.pt

or for the fine-tuned roberta-large model (1.5 GB):

wget https://openaipublic.azureedge.net/gpt-2/detector-models/v1/detector-large.pt
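On systems without wget, the same files can be fetched with Python's standard library. This is only a sketch: the two URLs are the ones listed above, while the names WEIGHT_URLS, weights_filename, and download_weights are hypothetical helpers introduced for illustration.

```python
import shutil
from urllib.request import urlopen

# Both URLs are copied from the wget commands above.
WEIGHT_URLS = {
    "base": "https://openaipublic.azureedge.net/gpt-2/detector-models/v1/detector-base.pt",
    "large": "https://openaipublic.azureedge.net/gpt-2/detector-models/v1/detector-large.pt",
}


def weights_filename(size: str) -> str:
    """Return the local file name for a checkpoint, e.g. 'detector-base.pt'."""
    return WEIGHT_URLS[size].rsplit("/", 1)[-1]


def download_weights(size: str = "base") -> str:
    """Stream the checkpoint into the current directory and return its file name."""
    filename = weights_filename(size)
    with urlopen(WEIGHT_URLS[size]) as response, open(filename, "wb") as out:
        # copyfileobj copies in chunks, so the large file is never held in memory.
        shutil.copyfileobj(response, out)
    return filename
```

Calling download_weights("base") from the top-level directory of the repository leaves detector-base.pt where the server command expects it.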

These RoBERTa-based models are fine-tuned with a mixture of temperature-1 and nucleus sampling outputs, which should generalize well to outputs generated using different sampling methods.
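For readers unfamiliar with the two sampling methods named above, here is a toy illustration in plain Python. It is not the code used to generate the training data. It shows the general idea under simple assumptions: temperature-1 sampling draws from the model's unmodified probability distribution, while nucleus (top-p) sampling first keeps only the smallest set of most likely tokens whose combined probability reaches a threshold top_p, then renormalizes. The function names and the example probabilities are made up for illustration.

```python
import random


def nucleus_filter(probs, top_p):
    """Keep the smallest set of most-likely tokens whose total probability reaches top_p.

    Returns a dict {token_index: renormalized_probability}.
    """
    ranked = sorted(enumerate(probs), key=lambda pair: pair[1], reverse=True)
    kept, cumulative = [], 0.0
    for index, p in ranked:
        kept.append((index, p))
        cumulative += p
        if cumulative >= top_p:
            break
    # Renormalize so the kept probabilities sum to 1 again.
    return {index: p / cumulative for index, p in kept}


def sample(probs, top_p=1.0, rng=random):
    """Draw one token index; top_p=1.0 leaves the distribution effectively untouched."""
    filtered = nucleus_filter(probs, top_p)
    indices = list(filtered)
    weights = [filtered[i] for i in indices]
    return rng.choices(indices, weights=weights, k=1)[0]
```

For example, nucleus_filter([0.5, 0.3, 0.15, 0.05], 0.7) keeps only the first two tokens, so the two least likely tokens can never be drawn, whereas sample(probs) with the default top_p can return any token.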

Training a new detector model

You can use the provided training script to train a detector model on a new set of datasets. We recommend using a GPU machine for this task.

# (in the top-level directory of this repository)
pip install -r requirements.txt
python -m detector.train

The training script supports a number of different options; append --help to the command above for usage.