Distinguishing text produced by Large Language Models (LLMs) from human-written text is challenging because LLMs can generalize information and rephrase it in new contexts. Our research focuses on identifying baseline similarities among LLM-generated texts by applying various MinHash techniques and comparing the results to human-written outputs. We aim to uncover a baseline similarity score for AI-generated content, which could help classify future inputs as either AI- or human-generated.
- Identify baseline similarities for LLM-generated texts.
- Explore and evaluate different MinHash techniques:
- Basic MinHashing
- K-shingling MinHashing
- SimHashing
- Cosine similarities
- Establish a baseline similarity score for AI-generated content.
- Classify future inputs (AI or human-generated) based on the baseline similarity score.
We use a dataset containing titles and abstracts of research papers from arXiv. Each title appears twice: once with the real abstract and once with an abstract generated by GPT-3. The dataset is divided into:
- Human-generated abstracts for testing.
- Machine-generated abstracts split into 80% for training and 20% for testing.
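The 80/20 split of the machine-generated abstracts can be sketched as follows. This is a minimal illustration with placeholder documents, not the project's actual loading code; the function name and seed are assumptions.

```python
import random

def split_machine_abstracts(abstracts, train_frac=0.8, seed=0):
    """Shuffle the machine-generated abstracts and split them 80/20
    into a training portion and a testing portion."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    shuffled = abstracts[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_frac)
    return shuffled[:cut], shuffled[cut:]

docs = [f"abstract {i}" for i in range(10)]  # placeholder documents
train, test = split_machine_abstracts(docs)
print(len(train), len(test))  # 8 2
```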
- Basic MinHashing: Estimate the Jaccard similarity between token sets using MinHash signatures.
- K-shingling MinHashing: Apply MinHash to overlapping sequences of k consecutive tokens (shingles), which preserves local word order.
- SimHashing: Create a fixed-size fingerprint for each document so that similar documents yield fingerprints with small Hamming distance.
- Cosine Similarities: Compare documents using cosine similarity of their vector representations instead of Jaccard similarity.
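The core MinHash idea behind these methods can be sketched as follows: each document's token set is summarized by the minimum value under many seeded hash functions, and the fraction of matching signature positions estimates the Jaccard similarity. This is a minimal stdlib sketch, not the project's implementation; the seeding scheme and example sentences are assumptions.

```python
import hashlib

def minhash_signature(tokens, num_hashes=128):
    """For each of num_hashes seeded hash functions, keep the minimum
    hash value over the token set; the list of minima is the signature."""
    sig = []
    for seed in range(num_hashes):
        sig.append(min(
            int(hashlib.md5(f"{seed}:{t}".encode()).hexdigest(), 16)
            for t in set(tokens)
        ))
    return sig

def estimated_jaccard(sig_a, sig_b):
    """The fraction of positions where two signatures agree is an
    unbiased estimate of the Jaccard similarity of the token sets."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)

a = "we present a method for detecting machine generated text".split()
b = "we present a new method for detecting generated text".split()
sig_a = minhash_signature(a)
sig_b = minhash_signature(b)
print(round(estimated_jaccard(sig_a, sig_b), 2))  # close to the true Jaccard of 0.8
```

With 128 hash functions the estimate's standard error is roughly 0.035 here, so the printed value lands near 0.8.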
- Hash Functions: 128 hash functions to generate each MinHash signature.
- Shingles and N-grams: Adjust the shingle size k for K-shingling MinHashing and explore different N-gram lengths.
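The shingling step above can be illustrated with a short sketch: a k-shingle is an overlapping run of k consecutive tokens, and the set of shingles (rather than individual tokens) is what gets hashed. The example sentence is an assumption for illustration.

```python
def k_shingles(tokens, k=4):
    """Return the set of overlapping runs of k consecutive tokens,
    joined into strings so each shingle hashes as a single unit."""
    return {" ".join(tokens[i:i + k]) for i in range(len(tokens) - k + 1)}

tokens = "large language models can imitate human writing style".split()
shingles = k_shingles(tokens, k=4)
print(len(shingles))  # 8 tokens -> 5 overlapping 4-shingles
```

Larger k makes shingles more specific (matches require longer shared word runs), which is why tuning k matters for classification accuracy.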
- Establish a baseline similarity score from pairwise comparisons of machine-generated abstracts in the training set.
- Compare Jaccard similarities (and Hamming distances for SimHash) of human-generated abstracts against those of machine-generated abstracts.
- Adjust classification thresholds to optimize accuracy.
- Optimal Parameters: 4-shingle MinHash using BertTokenizer achieved the highest classification accuracy for the testing set.
- Graphical Analysis: Generated graphs for each parameter to determine optimal settings for K-shingle and N-grams.
- Case Study: Successfully classified human- and AI-generated essays using a training set of prompt-engineered essays and a simple model.
Our study demonstrates the effectiveness of various MinHashing techniques in distinguishing AI-generated content from human-generated content. By identifying key parameters and evaluating different methods, we achieved consistent and accurate detection of AI-generated abstracts. Our findings support the viability of similarity classification through MinHash in detecting AI-generated content.
We are exploring the viability of developing a system to classify essays using a training set generated through prompt engineering. This system could serve as a commercial application of our methodology.