First of all (and full disclaimer), I was already biased against OpenAI's ChatGPT detector before deciding to test it. Companies like OpenAI are probably not developing AI detection software to help catch cheats. They're most likely developing it so that they don't train future AIs on the output of today's AIs.
But the media has glommed on to the idea of a cheat detector, so I decided to put it through its paces as such.
The Test
Step 1: Let’s establish a baseline:
Ok, let’s start with something that definitely wasn’t written by an AI - the first dozen or so verses from the Bible:
Unclear if it’s… an AI. Is God an AI? [foundations shaken]
Step 2: Have ChatGPT generate some text:
Next up, feed it some 100% pure grass-fed organic AI generated text.
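For the record, I ran this through the ChatGPT web UI, but here's roughly what Step 2 looks like in code. This is a sketch, assuming the openai Python package (the v0.x completions API), an API key in your environment, and a made-up prompt:

```python
# Minimal sketch of generating "known AI" text via the OpenAI API.
# Assumes the openai Python package (v0.x completions API) and an
# OPENAI_API_KEY environment variable. The prompt is just an example.
import os
import openai

openai.api_key = os.environ["OPENAI_API_KEY"]

response = openai.Completion.create(
    model="text-davinci-003",  # GPT-3.5-era completion model
    prompt="Write three paragraphs about the history of the bicycle.",
    max_tokens=400,
    temperature=0.7,
)

ai_text = response["choices"][0]["text"].strip()
print(ai_text)  # this goes into the detector in Step 3
```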
Step 3: Run it through OpenAI’s detector:
The results:
Possibly. Possibly? Directly from the mouths of AIs into the classifier, and the result is "possibly"? I mean, anything is possible, right? What's a teacher or other interested party supposed to do with "possibly"? Hopefully, absolutely nothing.
Which brings me to my biggest worry about AI detection: false positives. If a detector says “possibly” or “probably” AI, how does anyone appeal that? A professional artist was already accused of using AI to generate art that he spent 100 hours on. The fallout from false positives is going to suck.
Step 4: Light editing to improve the score:
For the final test, I’m doing what any ambitious 8th grader would do. I’m making some alterations to beat the detector.
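My actual edits were done by hand, but to show the level of effort we're talking about, here's a hypothetical sketch. The swap table is invented for illustration:

```python
# Hypothetical sketch of "ambitious 8th grader" edits: swap a few stilted
# words for plain ones. The swap table is made up for this demo.
SWAPS = {
    "utilize": "use",
    "additionally": "also",
    "numerous": "a lot of",
    "individuals": "people",
}

def lightly_edit(text: str) -> str:
    # Naive, case-sensitive replacement; good enough for a demo.
    for stilted, plain in SWAPS.items():
        text = text.replace(stilted, plain)
    return text

print(lightly_edit("Bicycles allow individuals to utilize numerous paths."))
# -> "Bicycles allow people to use a lot of paths."
```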
Were those edits enough?
As far as the classifier was concerned, it was now as good as gospel.
Why is it So Bad?
I get it’s early, and OpenAI is putting the detector (technically, a “classifier”) out there to gather feedback and data, but why isn’t it better out of the gate? OpenAI can conceivably spend a million GPU-hours generating known AI text to train it on. For the non-AI test data they can use, oh, I don’t know, everything single written before 2021?
Maybe any sufficiently advanced AI author is indistinguishable from a regular human.
The detector may also be deliberately bad. Until they start charging money for it, I won't believe they've put much energy towards it.
Researchers are working on embedding "watermarks" in generated text, which should greatly improve detection (there's a sketch of how that might work after this list), but:
Not every text generator will use the same watermark.
This is still only targeted at preventing future AIs from training on existing AI outputs, because…
Hand editing of the text (or running it through a different tool) will corrupt the watermark.
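For the curious, here's a toy sketch of how watermark detection could work, loosely modeled on published "green list" schemes (e.g., Kirchenbauer et al., 2023). Real implementations hash token IDs from the model's tokenizer, not whole words, and bias sampling toward "green" tokens at generation time; this is just the detection-side counting.

```python
# Toy sketch of "green list" watermark detection. A watermarking generator
# would nudge sampling toward words whose hash (seeded by the previous word)
# lands in the "green" half; the detector counts green words and checks
# whether the fraction is suspiciously above 50%.
import hashlib
import math

def is_green(prev_word: str, word: str) -> bool:
    # Hash the (previous word, word) pair; ~half of all words are "green"
    # for any given context.
    digest = hashlib.sha256((prev_word + "|" + word).encode()).digest()
    return digest[0] % 2 == 0

def watermark_z_score(words: list[str]) -> float:
    # Unwatermarked text hits green with probability ~0.5, so a large
    # z-score suggests the text was generated with the watermark on.
    green = sum(is_green(p, w) for p, w in zip(words, words[1:]))
    n = len(words) - 1
    return (green - 0.5 * n) / math.sqrt(0.25 * n)

text = "the quick brown fox jumps over the lazy dog".split()
print(f"z-score: {watermark_z_score(text):.2f}")  # near 0: no watermark
```

That last point falls straight out of the math: edit a few words and you change both the hashed contexts and the green count, dragging the z-score back toward "not watermarked."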
People, don’t count on AI detection. Anyone who tells you otherwise is a prophet peddling false hope.
About the Author: Scott Swigart is the Chief Research Officer at Cascade Insights, a firm providing technology, buyer, and strategy research for B2B technology companies.