Examples of Prompts that cause ChatGPT-4 to hallucinate.

Contributors: Stephen Casper (scasper@mit.edu), Luke Bailey (lukebailey@college.harvard.edu), Zachary Marinov (zmarinov@mit.edu), Michael Gerovich (mgerov@mit.edu), Andrew Garber (andrewgarber@college.harvard.edu), Shuvom Sadhuka (ssadhuka@mit.edu), Oam Patel (opatel@college.harvard.edu), Riley Kong (rileyis@mit.edu)

Dataset: We compiled 104 examples of prompts that cause ChatGPT-4 (May 24, 2023 version) to hallucinate untrue content and grouped them into 18 categories. Here we are sharing the examples. Read our post about it: LINK COMING SOON.

The categories are:

Arbitrarily resolving ambiguity in the prompt
Being asked to make things up (non-adversarial)
BS about fictitious things
BS about unremarkable things
BS extrapolation from trends
BS meanings of theorems
BS proofs of true theorems
BS uses of unrelated lemmas
BS references
Common misconceptions
Defending BS
Deferring to doubt
Failing to answer ‘all’
Failing to answer ‘none’
Imitating untrustworthy people (non-adversarial)
Justifying a wrong response
Making up outrageous facts
Shifts from a common setup

These categories are not, and were not intended to be, a complete taxonomy. There are certainly other ways to make GPT-4 output falsehoods. In general, any type of question that is difficult to answer correctly would qualify, but we focus on categories that we find particularly egregious.

What we hope this is useful for: Our dataset of examples is small and was collected with a just-messing-around methodology. Still, some might find these examples useful for testing how chatbots behave with respect to hallucination. Our taxonomy could also be useful for studying hallucination more systematically. We also invite OpenAI to fix these issues, and we invite anyone with additional ideas or examples to send them to us so we can update the dataset :)
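For anyone who wants to use the examples as test cases, here is a minimal sketch of one way to run prompts through a chatbot and group the responses by category. It assumes the examples are available as (category, prompt) pairs. The file layout of the dataset is not described here, so the `examples` list, the placeholder prompt strings, and the `ask_model` callable are all hypothetical stand-ins rather than part of the dataset or any real API.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Tuple


def run_by_category(
    examples: Iterable[Tuple[str, str]],
    ask_model: Callable[[str], str],
) -> Dict[str, List[Tuple[str, str]]]:
    """Send each prompt to the model and group (prompt, response) pairs by category."""
    results: Dict[str, List[Tuple[str, str]]] = defaultdict(list)
    for category, prompt in examples:
        results[category].append((prompt, ask_model(prompt)))
    return dict(results)


# Hypothetical placeholders: substitute real (category, prompt) pairs from the dataset.
examples = [
    ("BS references", "<prompt 1>"),
    ("BS references", "<prompt 2>"),
    ("Common misconceptions", "<prompt 3>"),
]


def echo_model(prompt: str) -> str:
    """Hypothetical stand-in for a chatbot call; replace with a real client."""
    return f"(response to {prompt})"


grouped = run_by_category(examples, echo_model)
for category, pairs in grouped.items():
    print(f"{category}: {len(pairs)} prompt(s)")
```

Whether a given response is actually a hallucination still needs to be judged by a person; the sketch only organizes the outputs for review.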

thestephencasper/gpt4_bs
