arxiv:2511.06676

How AI Fails: An Interactive Pedagogical Tool for Demonstrating Dialectal Bias in Automated Toxicity Models

Published on Nov 10, 2025

AI-generated summary

A study examines bias in AI toxicity detection models, finding systematic disparities between African-American English and Standard American English, and presents an interactive tool to demonstrate these biases.

Abstract

Now that AI-driven moderation has become pervasive in everyday life, we often hear claims that "the AI is biased". While this is often said jokingly, the light-hearted remark reflects a deeper concern: how can we be certain that an online post flagged as "inappropriate" was not simply the victim of a biased algorithm? This paper investigates the problem with a dual approach. First, I conduct a quantitative benchmark of a widely used toxicity model (unitary/toxic-bert) to measure the performance disparity between text in African-American English (AAE) and Standard American English (SAE). The benchmark reveals a clear, systematic bias: on average, the model scores AAE text 1.8 times higher for toxicity and 8.8 times higher for "identity hate" than comparable SAE text. Second, I introduce an interactive pedagogical tool that makes these abstract biases tangible. The tool's core mechanic, a user-controlled "sensitivity threshold," demonstrates that the biased score itself is not the only harm; the greater harm is the human-set, seemingly neutral policy that ultimately operationalises discrimination. This work provides both statistical evidence of disparate impact and a public-facing tool designed to foster critical AI literacy.
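The disparity measurement and the threshold mechanic are straightforward to illustrate in code. The sketch below is a minimal illustration, not the paper's own benchmark: it assumes the public unitary/toxic-bert checkpoint, the Hugging Face transformers text-classification pipeline, and two hypothetical paired AAE/SAE sentences; it scores both variants per label and applies a user-chosen sensitivity threshold to show which one would be flagged.

```python
# Minimal sketch (not the paper's benchmark code): score paired AAE/SAE
# sentences with the public unitary/toxic-bert checkpoint and apply a
# user-chosen "sensitivity threshold" to see which variant gets flagged.
from transformers import pipeline

# toxic-bert is a multi-label classifier (toxic, severe_toxic, obscene,
# threat, insult, identity_hate); sigmoid gives an independent score per label.
classifier = pipeline(
    "text-classification",
    model="unitary/toxic-bert",
    top_k=None,
    function_to_apply="sigmoid",
)

def scores_by_label(text):
    """Return a {label: score} dict for a single input string."""
    return {item["label"]: item["score"] for item in classifier([text])[0]}

# Hypothetical paired sentences, for illustration only; the paper evaluates
# its own AAE/SAE benchmark set.
aae_scores = scores_by_label("I ain't got no time for that mess")
sae_scores = scores_by_label("I do not have any time for that nonsense")

threshold = 0.5  # the interactive tool's user-controlled sensitivity threshold
for label in ("toxic", "identity_hate"):
    ratio = aae_scores[label] / max(sae_scores[label], 1e-9)
    flagged = aae_scores[label] >= threshold
    print(f"{label}: AAE={aae_scores[label]:.3f}  SAE={sae_scores[label]:.3f}  "
          f"AAE/SAE={ratio:.1f}x  AAE flagged at {threshold}: {flagged}")
```

Sliding the threshold up or down makes the paper's point concrete: the same seemingly neutral cutoff can leave the SAE phrasing untouched while flagging its AAE equivalent, so the policy choice, not just the model score, is what operationalises the disparate impact.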
