
Automatic detection of Gen-AI texts: A comparative framework of neural models


Cristian Buttaro, Irene Amerini

arXiv:2603.18750v1 Announce Type: new Abstract: The rapid proliferation of Large Language Models has significantly increased the difficulty of distinguishing between human-written and AI-generated texts, raising critical issues across academic, editorial, and social domains. This paper investigates the problem of AI-generated text detection through the design, implementation, and comparative evaluation of multiple machine-learning-based detectors. Four neural architectures are developed and analyzed: a Multilayer Perceptron, a one-dimensional Convolutional Neural Network, a MobileNet-based CNN, and a Transformer model. The proposed models are benchmarked against widely used online detectors, including ZeroGPT, GPTZero, QuillBot, Originality.AI, Sapling, IsGen, Rephrase, and Writer. Experiments are conducted on the COLING Multilingual Dataset, considering both English and Italian configurations, as well as on an original thematic dataset focused on Art and Mental Health. Results show that supervised detectors achieve more stable and robust performance than commercial tools across different languages and domains, highlighting key strengths and limitations of current detection strategies.

Executive Summary

This study presents a comparative framework for detecting Gen-AI texts using multiple machine-learning-based detectors. The authors develop and evaluate four neural architectures (a Multilayer Perceptron, a 1D Convolutional Neural Network, a MobileNet-based CNN, and a Transformer model) on two datasets: the COLING Multilingual Dataset (English and Italian configurations) and an original thematic dataset on Art and Mental Health. The results indicate that the supervised detectors achieve more stable and robust performance than commercial tools across languages and domains. The work offers a clear view of the strengths and limitations of current detection strategies and underscores the need for robust, language-independent AI-detection methods. The findings have significant implications for academic, editorial, and social domains, where effective detection tools are needed to counter AI-generated misinformation.
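To make the supervised-detector idea concrete, here is a minimal sketch of a text classifier: TF-IDF features feeding a small Multilayer Perceptron. This is an illustration only — the paper's actual architectures, features, hyperparameters, and training data are not reproduced here, and the tiny corpus below is entirely invented.

```python
# Minimal sketch of a supervised AI-text detector: TF-IDF + MLP.
# All texts and labels below are toy examples, not the paper's data.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline

# Toy corpus: label 1 = AI-generated, 0 = human-written (hypothetical).
texts = [
    "the model delivers a comprehensive and robust overview of the topic",
    "in conclusion the aforementioned considerations demonstrate the point",
    "honestly i just scribbled this note between meetings",
    "my grandmother's recipe never measured anything exactly",
]
labels = [1, 1, 0, 0]

detector = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),        # unigram + bigram features
    MLPClassifier(hidden_layer_sizes=(16,),     # small Multilayer Perceptron
                  max_iter=2000, random_state=0),
)
detector.fit(texts, labels)

# Sanity check on the training snippets (a toy smoke test,
# not the kind of held-out evaluation the paper performs).
preds = detector.predict(texts)
print(list(preds))
```

In a real setup the pipeline would be trained on a labeled corpus such as the COLING data and evaluated on held-out text, with the feature extractor swapped out per architecture (e.g., learned embeddings for the CNN and Transformer variants).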

Key Points

  • The study presents a comparative framework for detecting Gen-AI texts using machine learning-based detectors.
  • Four neural architectures are developed and evaluated on two datasets, showing supervised detectors to be more stable and robust than commercial tools.
  • The research highlights the need for more robust and language-independent AI detection methods to combat AI-generated misinformation.

Merits

Comprehensive framework

The study's comparative framework and evaluation of multiple neural architectures provide a comprehensive understanding of current detection strategies.

Robust performance

Supervised detectors demonstrate more stable and robust performance than commercial tools across different languages and domains.
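The "stability" claim can be made concrete with a toy comparison: the spread of a detector's accuracy across languages serves as a simple proxy for robustness. All counts below are invented for illustration and are not the paper's results.

```python
# Hypothetical per-language results as (correct, total) counts.
# Detector names and numbers are illustrative assumptions only.
results = {
    ("MLP", "en"): (90, 100),
    ("MLP", "it"): (88, 100),
    ("CommercialTool", "en"): (85, 100),
    ("CommercialTool", "it"): (60, 100),
}

def accuracy(correct, total):
    return correct / total

def spread(detector):
    """Max-min accuracy across languages: smaller means more stable."""
    accs = [accuracy(*v) for (name, _), v in results.items() if name == detector]
    return max(accs) - min(accs)

print(spread("MLP"))              # small spread: stable across languages
print(spread("CommercialTool"))   # large spread: language-sensitive
```

A small spread means the detector degrades little when moving from English to Italian, which is the kind of cross-language consistency the study reports for its supervised models.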

Demerits

Dataset coverage

The study relies on two datasets, which may not be representative of all AI-generated texts and languages.

Commercial tool comparison

The evaluation of commercial tools may be limited by their proprietary nature, restricting the ability to fully understand their detection mechanisms.

Expert Commentary

The study makes a meaningful contribution to AI-text detection, but its conclusions should be read against two caveats: the commercial tools it benchmarks are proprietary black boxes whose detection mechanisms cannot be inspected, and the datasets carry their own potential biases. The findings also belong to a broader conversation about AI-generated content and its societal impact, where detection accuracy directly affects trust in academic and editorial workflows. Continued work on robust, language-independent detection methods therefore remains essential. The paper's emphasis on systematic, comparative model evaluation is particularly noteworthy, and progress in this area will likely require interdisciplinary collaboration and sustained research.

Recommendations

  • Future research should focus on developing more robust and language-independent AI detection methods, taking into account the limitations and biases of current approaches.
  • Interdisciplinary collaboration between researchers, policymakers, and industry stakeholders is essential for developing effective AI-detection tools and addressing the challenges posed by AI-generated misinformation.
