Large Language Models in the Abuse Detection Pipeline

arXiv:2604.00323v1. Abstract: Online abuse has grown increasingly complex, spanning toxic language, harassment, manipulation, and fraudulent behavior. Traditional machine-learning approaches dependent on static classifiers and labor-intensive labeling struggle to keep pace with evolving threat patterns and nuanced policy requirements. Large Language Models introduce new capabilities for contextual reasoning, policy interpretation, explanation generation, and cross-modal understanding, enabling them to support multiple stages of modern safety systems. This survey provides a lifecycle-oriented analysis of how LLMs are being integrated into the Abuse Detection Lifecycle (ADL), which we define across four stages: (I) Label & Feature Generation, (II) Detection, (III) Review & Appeals, and (IV) Auditing & Governance. For each stage, we synthesize emerging research and industry practices, highlight architectural considerations for production deployment, and examine the strengths and limitations of LLM-driven approaches. We conclude by outlining key challenges, including latency, cost-efficiency, determinism, adversarial robustness, and fairness, and discuss future research directions needed to operationalize LLMs as reliable, accountable components of large-scale abuse-detection and governance systems.

Executive Summary

This article analyzes how Large Language Models (LLMs) are integrated across the Abuse Detection Lifecycle (ADL), which the authors define in four stages: Label & Feature Generation, Detection, Review & Appeals, and Auditing & Governance. The authors synthesize emerging research and industry practices, highlighting the strengths and limitations of LLM-driven approaches at each stage. Key challenges identified include latency, cost-efficiency, determinism, adversarial robustness, and fairness, and the article closes with research directions for operationalizing LLMs in large-scale abuse-detection and governance systems. The lifecycle-oriented framing gives practitioners and researchers a practical map for applying LLMs to abuse detection.

Key Points

  • LLMs introduce new capabilities for contextual reasoning, policy interpretation, explanation generation, and cross-modal understanding.
  • The authors define the Abuse Detection Lifecycle (ADL) across four stages: Label & Feature Generation, Detection, Review & Appeals, and Auditing & Governance.
  • The study highlights key challenges in integrating LLMs, including latency, cost-efficiency, determinism, adversarial robustness, and fairness.
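As a concrete illustration of the four-stage lifecycle, a minimal content item could flow through the ADL roughly as sketched below. All names, the keyword heuristic, and the verdict scheme are hypothetical stand-ins for illustration; the paper does not prescribe this implementation, and in practice Stage I and III would involve LLM or human judgment rather than string matching.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class Verdict(Enum):
    ALLOW = "allow"
    REMOVE = "remove"
    ESCALATE = "escalate"


@dataclass
class Item:
    text: str
    labels: dict = field(default_factory=dict)
    verdict: Optional[Verdict] = None
    audit_log: list = field(default_factory=list)


def label_and_featurize(item: Item) -> Item:
    # Stage I: produce weak labels/features (here a toy keyword heuristic
    # standing in for LLM-generated labels).
    item.labels["toxic"] = "idiot" in item.text.lower()
    item.audit_log.append("stage I: labeled")
    return item


def detect(item: Item) -> Item:
    # Stage II: classify the item using the generated features.
    item.verdict = Verdict.REMOVE if item.labels["toxic"] else Verdict.ALLOW
    item.audit_log.append(f"stage II: {item.verdict.value}")
    return item


def review(item: Item, appealed: bool) -> Item:
    # Stage III: route appealed removals to a reviewer (human or LLM).
    if appealed and item.verdict is Verdict.REMOVE:
        item.verdict = Verdict.ESCALATE
        item.audit_log.append("stage III: escalated on appeal")
    return item


def audit(item: Item) -> list:
    # Stage IV: expose the full decision trail for auditing and governance.
    return item.audit_log
```

The point of the sketch is structural: every stage appends to an audit trail, so Stage IV can reconstruct why a verdict was reached, which is the accountability property the survey emphasizes.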

Merits

Comprehensive Analysis

The authors provide a thorough examination of the integration of LLMs in the ADL, covering all stages and highlighting both strengths and limitations.

Practical Relevance

The study's focus on practical applications and industry practices makes it highly relevant to practitioners seeking to leverage LLMs in abuse detection.

Future Research Directions

The authors' identification of key challenges and future research directions offers valuable insights for researchers seeking to operationalize LLMs in large-scale abuse-detection systems.

Demerits

Scope Limitations

The study's focus on LLMs may limit its scope, neglecting potential benefits and challenges of other approaches, such as traditional machine learning methods.

Methodological Limitations

The authors' reliance on existing research and industry practices may introduce methodological limitations, such as bias towards established solutions.

Expert Commentary

The article offers a timely, lifecycle-oriented analysis of how LLMs are being integrated into abuse detection, and its stage-by-stage framing helps both practitioners and researchers map LLM capabilities onto concrete parts of a safety system. Its main weaknesses, an LLM-centric scope and a reliance on already-published research and industry practice, are noted above and temper how far its conclusions generalize. Even so, the findings clarify where LLMs add value and where they introduce risk, with implications both for system design and for policy development.

Recommendations

  • Future research should investigate the potential benefits and challenges of integrating LLMs with traditional machine learning methods in abuse detection systems.
  • Practitioners should carefully consider latency, cost-efficiency, determinism, adversarial robustness, and fairness when implementing LLMs in abuse detection systems.
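One way to act on both recommendations is a cascade that combines a cheap traditional scorer with an LLM reserved for borderline content, so that latency and cost are paid only where contextual reasoning is needed. Everything below (`FLAG_TERMS`, `cheap_score`, `llm_judge`, the thresholds) is a hypothetical sketch of that pattern, not a method from the paper:

```python
FLAG_TERMS = {"scam", "idiot", "kill"}


def cheap_score(text: str) -> float:
    # Hypothetical lightweight scorer: fraction of tokens that are flagged
    # terms. In production this would be a fast trained classifier.
    tokens = text.lower().split()
    if not tokens:
        return 0.0
    return sum(t in FLAG_TERMS for t in tokens) / len(tokens)


def llm_judge(text: str) -> str:
    # Stand-in for an expensive LLM call that would receive the policy text
    # and the content, and return a policy-grounded verdict.
    return "remove" if "scam" in text.lower() else "allow"


def moderate(text: str, low: float = 0.1, high: float = 0.6) -> str:
    # Cascade: confident cases short-circuit; only the uncertain band
    # between the thresholds incurs LLM latency and cost.
    score = cheap_score(text)
    if score < low:
        return "allow"      # confidently benign, no LLM call
    if score > high:
        return "remove"     # confidently abusive, no LLM call
    return llm_judge(text)  # borderline: escalate to the LLM
```

Tuning `low` and `high` directly trades off the challenges the survey lists: widening the band improves nuance (more content gets contextual review) at the cost of latency, spend, and the determinism lost to stochastic model outputs.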

Sources

Original: arXiv - cs.CL