
Large Language Models Reproduce Racial Stereotypes When Used for Text Annotation


Petter Törnberg

arXiv:2603.13891v1. Abstract: Large language models (LLMs) are increasingly used for automated text annotation in tasks ranging from academic research to content moderation and hiring. Across 19 LLMs and two experiments totaling more than 4 million annotation judgments, we show that subtle identity cues embedded in text systematically bias annotation outcomes in ways that mirror racial stereotypes. In a names-based experiment spanning 39 annotation tasks, texts containing names associated with Black individuals are rated as more aggressive by 18 of 19 models and more gossipy by 18 of 19. Asian names produce a bamboo-ceiling profile: 17 of 19 models rate individuals as more intelligent, while 18 of 19 rate them as less confident and less sociable. Arab names elicit cognitive elevation alongside interpersonal devaluation, and all four minority groups are consistently rated as less self-disciplined. In a matched dialect experiment, the same sentence is judged significantly less professional (all 19 models, mean gap $-0.774$), less indicative of an educated speaker ($-0.688$), more toxic (18/19), and more angry (19/19) when written in African American Vernacular English rather than Standard American English. A notable exception occurs for name-based hireability, where fine-tuning appears to overcorrect, systematically favoring minority-named applicants. These findings suggest that using LLMs as automated annotators can embed socially patterned biases directly into the datasets and measurements that increasingly underpin research, governance, and decision-making.

Executive Summary

This study examines how large language models (LLMs) behave when used for automated text annotation, revealing systematic biases that mirror racial stereotypes. Across 19 LLMs, two experiments, and more than 4 million annotation judgments, the authors show that subtle identity cues embedded in text systematically shift annotation outcomes. Texts containing names associated with Black individuals are rated as more aggressive, Asian names produce a bamboo-ceiling profile of higher intelligence but lower confidence and sociability, Arab names elicit cognitive elevation alongside interpersonal devaluation, and all four minority groups are rated as less self-disciplined. In a matched dialect experiment, the same sentence is judged less professional, less indicative of an educated speaker, more toxic, and more angry when written in African American Vernacular English rather than Standard American English. One exception is name-based hireability, where fine-tuning appears to overcorrect in favor of minority-named applicants. The findings have significant implications for research, governance, and decision-making, highlighting the need for caution when relying on LLMs as automated annotators.
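The paper's core design is a counterfactual probe: the same text is annotated repeatedly with only an identity cue (a name or dialect) varied, so any rating gap is attributable to the cue alone. Below is a minimal sketch of that idea; the template sentence, the name lists, and the `llm_rate` stub are illustrative assumptions, not the authors' materials.

```python
# Minimal sketch of a names-based counterfactual annotation probe.
# The template, name groups, and `llm_rate` stub are illustrative
# assumptions, not the study's actual materials.
from statistics import mean
from typing import Callable

TEMPLATE = "{name} interrupted the meeting to raise a concern about the budget."

NAME_GROUPS = {
    "white_assoc": ["Greg", "Emily"],
    "black_assoc": ["Jamal", "Lakisha"],
}

def probe_trait(llm_rate: Callable[[str, str], float], trait: str) -> dict[str, float]:
    """Rate the identical sentence with only the embedded name swapped,
    then average ratings per name group. `llm_rate(text, trait)` stands
    in for a call to whichever model is being audited and should return
    a numeric rating (e.g. 1-7) for the given trait."""
    scores: dict[str, list[float]] = {group: [] for group in NAME_GROUPS}
    for group, names in NAME_GROUPS.items():
        for name in names:
            text = TEMPLATE.format(name=name)
            scores[group].append(llm_rate(text, trait))
    return {group: mean(vals) for group, vals in scores.items()}

if __name__ == "__main__":
    dummy = lambda text, trait: 3.0  # replace with a real model call
    print(probe_trait(dummy, "aggressiveness"))
```

Because the texts are identical apart from the name, any gap between group means is, by construction, attributable to the name cue rather than to the content being annotated.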

Key Points

  • Across 19 LLMs and more than 4 million annotation judgments, subtle identity cues in text systematically bias annotation outcomes
  • The biases mirror racial stereotypes, including dialect penalties for African American Vernacular English
  • The findings have significant implications for research, governance, and decision-making that rely on LLM-annotated data

Merits

Strength in methodology

The study employs a rigorous methodology: two large-scale experiments (a names-based design spanning 39 annotation tasks and a matched dialect comparison) run across 19 LLMs, totaling more than 4 million annotation judgments.

Demerits

Limitation in generalizability

The study's findings may not generalize to LLMs or annotation tasks beyond those tested, highlighting the need for further research in this area.

Expert Commentary

This study provides a timely and important contribution to the growing body of research on the limitations and biases of LLMs. By demonstrating systematic, stereotype-consistent biases in annotation tasks, the authors highlight the need for greater attention to fairness, equity, and social justice in AI development and deployment, and for caution when LLM-generated labels feed into research, governance, and decision-making. The methodology and results also provide a valuable foundation for future work on the sources and consequences of LLM bias.

Recommendations

  • Develop and deploy LLMs that are explicitly designed to be fair, equitable, and unbiased
  • Establish robust evaluation protocols and testing frameworks to detect and mitigate LLM bias (a minimal audit sketch follows below)
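As a concrete instance of the second recommendation, here is a minimal audit sketch, assuming matched text pairs that differ only in an identity cue (e.g. AAVE vs. Standard American English renderings of the same sentence). The pair format, the 0.1 flag threshold, and the example ratings are assumptions for illustration, not part of the paper, which reports gaps such as $-0.774$ for professionalism under AAVE.

```python
# Hedged sketch of a routine bias-audit check for an annotation pipeline.
# The data format and threshold are illustrative assumptions.
from statistics import mean

def paired_gap(ratings_a: list[float], ratings_b: list[float]) -> float:
    """Mean rating difference (A minus B) over matched text pairs that
    differ only in an identity cue."""
    assert len(ratings_a) == len(ratings_b)
    return mean(a - b for a, b in zip(ratings_a, ratings_b))

def audit(pairs: dict[str, tuple[list[float], list[float]]],
          threshold: float = 0.1) -> dict[str, float]:
    """Flag any annotation dimension whose mean gap exceeds a chosen
    tolerance. `pairs` maps a dimension name (e.g. 'professionalism')
    to the two rating lists; `threshold` is an illustrative cutoff."""
    gaps = {dim: paired_gap(a, b) for dim, (a, b) in pairs.items()}
    return {dim: gap for dim, gap in gaps.items() if abs(gap) > threshold}

if __name__ == "__main__":
    # Example: ratings of the same sentences written in AAVE vs. SAE.
    pairs = {"professionalism": ([3.1, 2.8, 3.0], [3.9, 3.6, 3.8])}
    print(audit(pairs))  # gap of roughly -0.8 -> flagged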

Sources