It's well known now that text-generating "AI" apps are inordinately fond of certain words ("delve" being maybe the most amusing lexical Voight-Kampff indicator), but no sooner are such words listed than the AI can be told to avoid using them. And so we move on to a second generation of linguistic tells, as exposed by more thorough statistical research into AI sloptext.
Once hundreds of so-called "marker words" that became significantly more common in the post-LLM era are highlighted, the telltale signs of LLM use can sometimes be easy to pick out. Take this example abstract line called out by the researchers, with the marker words highlighted: "A comprehensive grasp of the intricate interplay between […] and […] is pivotal for effective therapeutic strategies."
After measuring how often the marker words appear across individual papers, the researchers estimate that at least 10 percent of the post-2022 papers in the PubMed corpus were written with at least some LLM assistance. The number could be even higher, the researchers say, because their set could be missing LLM-assisted abstracts that don't include any of the marker words they identified.
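For the curious, the basic idea of a "marker word" can be roughed out in a few lines of Python. This is only a toy sketch, not the researchers' actual method: it assumes you have two piles of abstracts (before and after LLMs arrived), and it flags words whose share of abstracts jumped by more than an arbitrary threshold. The function names, the `min_gap` cutoff, and the sample abstracts are all made up for illustration.

```python
import re


def doc_frequency(abstracts, word):
    """Fraction of abstracts containing `word` at least once (case-insensitive)."""
    if not abstracts:
        return 0.0
    pattern = re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)
    hits = sum(1 for text in abstracts if pattern.search(text))
    return hits / len(abstracts)


def excess_usage(before, after, word):
    """How much more common `word` is in the later pile than in the earlier one."""
    return doc_frequency(after, word) - doc_frequency(before, word)


def marker_words(before, after, candidates, min_gap=0.2):
    """Candidates whose usage rose by at least `min_gap`, biggest jump first.

    `min_gap` is an arbitrary illustrative cutoff, not a published threshold.
    """
    gaps = [(word, excess_usage(before, after, word)) for word in candidates]
    flagged = [(word, gap) for word, gap in gaps if gap >= min_gap]
    return sorted(flagged, key=lambda pair: pair[1], reverse=True)


# Invented sample abstracts, purely for demonstration.
before = [
    "We measured enzyme activity in mouse liver samples.",
    "The intricate anatomy of the inner ear is reviewed.",
    "A randomized trial of aspirin in adults.",
    "Gene expression was profiled in yeast.",
]
after = [
    "We delve into the intricate interplay between diet and sleep.",
    "An intricate signalling network is pivotal for repair.",
    "A randomized trial of statins in adults.",
    "This study underscores the intricate role of gut microbes.",
]

flagged = marker_words(before, after, ["delve", "intricate", "trial"])
for word, gap in flagged:
    print(f"{word}: +{gap:.2f}")
```

A word like "trial" that is equally common in both piles never gets flagged, which is the point: the signal is the jump, not the word itself. A real analysis would need far more text and a proper baseline for how vocabulary drifts from year to year anyway.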
The impression is of managerial English transposed to scientific topics. Which was, I think, John W. Campbell's enduring contribution to the language of Scientology. Delving pivotally into the intricate interplay of our technology-mediated brain rot, you might say—or not.