Students fight false accusations from AI-detection snake oil

The pressures of teaching create blind spots for a certain kind of snake oil: AI detection software. Inaccurate as it is, it serves as administrative proof for forensically unprovable suspicions. Its use is backed by zero-tolerance policies, weak or absent investigative processes, psychological buy-in and discriminatory enforcement. Those targeted end up in an all-or-nothing fight with bureaucracy, the kind most likely to be resolved by threats of public humiliation, litigation or the generous application of political corrosives. Bloomberg reports on the students fighting back, such as Moira Olmsted, a teaching student at Central Methodist University who was falsely accused of using AI by her instructor.

Just weeks into the fall semester, Olmsted submitted a written assignment in a required class—one of three reading summaries she had to do each week. Soon after, she received her grade: zero. When she approached her professor, Olmsted said she was told that an AI detection tool had determined her work was likely generated by artificial intelligence. In fact, the teacher said, her writing had been flagged at least once before.

For Olmsted, now 24, the accusation was a "punch in the gut." It was also a threat to her standing at the university. "It's just kind of like, oh my gosh, this is what works for us right now—and it could be taken away for something I didn't do," she says.

Note the instructor's mindless "computer says no" reasoning. But you were flagged, Moira!

The grade was ultimately changed, but not before she received a strict warning: If her work was flagged again, the teacher would treat it the same way they would with plagiarism.

We admit it was wrong, but we assure you it will be wrong again!

Businessweek tested two of the leading services—GPTZero and Copyleaks—on a random sample of 500 college application essays submitted to Texas A&M University in the summer of 2022, shortly before the release of ChatGPT, effectively guaranteeing they weren't AI-generated. The essays were obtained through a public records request, meaning they weren't part of the datasets on which AI tools are trained. Businessweek found the services falsely flagged 1% to 2% of the essays as likely written by AI, in some cases claiming to have near 100% certainty.

Spot the flaws in the methodological assumptions, though perhaps they were just written up too briefly here. Even granting those assumptions, a 1% to 2% error rate is unacceptable: in a 25-student class, that's a false accusation roughly every other assignment. And in reality the tools aren't even that accurate. Turnitin admitted to a 4% error rate, but otherwise "declined to make its service available" to Bloomberg for its reportage.
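To see how a "small" false-positive rate compounds, here's a back-of-envelope sketch. The 2% figure comes from the Businessweek test above; the class size, assignment count, prevalence of actual AI use and detector sensitivity are assumptions invented for illustration.

```python
# Back-of-envelope illustration of detector false positives.
# Only the 2% false-positive rate comes from the Businessweek test;
# every other number here is a hypothetical assumption.

def expected_false_flags(students: int, assignments: int, fp_rate: float) -> float:
    """Expected number of honest submissions falsely flagged as AI."""
    return students * assignments * fp_rate

# A 25-student class, three reading summaries a week (as in Olmsted's
# course) over a 15-week semester, at a 2% false-positive rate:
flags = expected_false_flags(students=25, assignments=3 * 15, fp_rate=0.02)
print(f"Expected false accusations per semester: {flags:.1f}")  # 22.5

def ppv(prevalence: float, sensitivity: float, fp_rate: float) -> float:
    """Positive predictive value: the share of flagged work that
    really is AI-written, given how common AI use actually is."""
    true_pos = prevalence * sensitivity
    false_pos = (1 - prevalence) * fp_rate
    return true_pos / (true_pos + false_pos)

# Assuming (hypothetically) 10% of submissions are AI-written and the
# detector catches 90% of those, roughly 1 in 6 accusations is still false:
print(f"Share of flags that are correct: {ppv(0.10, 0.90, 0.02):.0%}")  # 83%
```

The second function is the base-rate point: even a detector that is "usually right" produces a steady stream of wrongly accused students once you multiply a small error rate across everything a class submits.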

Moreover, AI-detecting AI tends to target writing produced by disadvantaged groups.

The students most susceptible to inaccurate accusations are likely those who write in a more generic manner, either because they're neurodivergent like Olmsted, speak English as a second language (ESL) or simply learned to use more straightforward vocabulary and a mechanical style, according to students, academics and AI developers.

A superficial problem is that AI detection research is strongly influenced by AI researchers and other academics eager to benefit from AI's potential uses. It's a new field and everyone is awfully enthusiastic! A deeper problem is that the snake oil providers are already offering to cure the problems caused by their own product: they want students to sign up themselves, use their services, provide all the writing they ever do, and the rest of it. They want to speedrun the enshittification triangle where they end up controlling what both the institutional user and the end user get to know about one another and can squeeze both.

Most of the schools that work with Copyleaks now give students access to the service, Yamin says, "so they can authenticate themselves" and see their own AI scores. Turnitin, meanwhile, is working to expand its AI product portfolio with a service to help students show the process of how they put together their written assignments, in response to feedback from teachers and pupils.

Authenticate yourself to the basilisk and it will play peek-a-boo! The final word: there are now services that edit texts to pass the AI-detection tests. Bloomberg found one of them, Hix Bypass, to be extremely effective.

A Bloomberg test of a service called Hix Bypass found that a human-written essay that GPTZero incorrectly said was 98.1% AI went down dramatically to 5.3% AI after being altered by the service.

Bloomberg doesn't say if it tested Hix Bypass with AI-written essays.