AI Text Detectors and Discrimination
Computer programs designed to identify AI-generated content, such as essays and job applications, have been found to discriminate against non-native English speakers, according to researchers.
Bias Uncovered in AI Text Detectors
Tests conducted on seven popular AI text detectors revealed a troubling bias. Articles written by people whose first language is not English were frequently mislabeled as AI-generated, a result with serious implications for students, academics, and job seekers.
Questioning Accuracy Claims
While teachers consider AI detection a crucial tool in deterring modern cheating methods, the researchers caution that the 99% accuracy claimed by some detectors is misleading at best.
Non-Native English Speakers Unfairly Targeted
When essays written for the Test of English as a Foreign Language (TOEFL) were analyzed, more than half were falsely identified as AI-generated, with one program flagging 98% of the essays as machine-composed. In contrast, when essays by native English-speaking eighth graders in the US were tested, over 90% were accurately identified as human-written.
Decoding the Discrimination
The researchers, publishing their findings in the journal Patterns, attribute this bias to the way detectors evaluate what is considered human versus AI-generated text. These programs rely on “text perplexity,” which measures how surprised or confused a generative language model is when predicting the next word in a sentence.
Because large language models like ChatGPT are trained to produce low-perplexity text, human-written work that uses common words and familiar patterns risks being mistaken for AI-generated content. The risk is particularly pronounced for non-native English speakers, who often rely on simpler word choices.
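To make the idea of perplexity concrete, here is a minimal sketch in Python. It assumes the standard definition of perplexity (the exponential of the average negative log-probability a model assigns to each word) and uses made-up probabilities purely for illustration. The function names and the threshold are hypothetical and are not taken from the study or from any real detector.

```python
import math


def perplexity(token_probs):
    """Perplexity from the probability a language model gave each actual word.

    Standard definition: exp of the average negative log-probability.
    """
    if not token_probs:
        raise ValueError("need at least one token probability")
    avg_neg_log_prob = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_neg_log_prob)


def flagged_as_ai(token_probs, threshold):
    """Hypothetical detector rule: text below the threshold is flagged as AI."""
    return perplexity(token_probs) < threshold


# Illustrative, invented probabilities (not real model output).
# Common, predictable wording: the model finds each word fairly likely.
predictable_text = [0.5, 0.5, 0.5, 0.5]
# Unusual wording: the model finds each word unlikely.
surprising_text = [0.1, 0.1, 0.1, 0.1]

print(perplexity(predictable_text))
print(perplexity(surprising_text))
print(flagged_as_ai(predictable_text, threshold=5.0))
print(flagged_as_ai(surprising_text, threshold=5.0))
```

Under these assumptions the predictable text scores a lower perplexity and falls under the hypothetical threshold, while the surprising text does not. That is the mechanism the researchers describe: plain, familiar wording looks "machine-like" to a perplexity-based check regardless of who wrote it.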
A Paradoxical Response
After uncovering the inherent bias in AI detector programs, the researchers experimented by having ChatGPT rewrite TOEFL essays using more sophisticated language.
Surprisingly, when these revised essays were reevaluated by the AI detectors, they were labeled as human-written. The researchers warn that, ironically, the use of AI detectors may push non-native writers to rely even more on AI tools like ChatGPT to evade detection.