Large Language Models Unpack Complex Political Opinions through Target-Stance Extraction

arXiv:2603.23531v1 Announce Type: new Abstract: Political polarization emerges from a complex interplay of beliefs about policies, figures, and issues. However, most computational analyses reduce discourse to coarse partisan labels, overlooking how these beliefs interact. This is especially evident in online political conversations, which are often nuanced and cover a wide range of subjects, making it difficult to automatically identify the target of discussion and the opinion expressed toward them. In this study, we investigate whether Large Language Models (LLMs) can address this challenge through Target-Stance Extraction (TSE), a recent natural language processing task that combines target identification and stance detection, enabling more granular analysis of political opinions. For this, we construct a dataset of 1,084 Reddit posts from r/NeutralPolitics, covering 138 distinct political targets and evaluate a range of proprietary and open-source LLMs using zero-shot, few-shot, and context-augmented prompting strategies. Our results show that the best models perform comparably to highly trained human annotators and remain robust on challenging posts with low inter-annotator agreement. These findings demonstrate that LLMs can extract complex political opinions with minimal supervision, offering a scalable tool for computational social science and political text analysis.

Executive Summary

This study investigates the application of Large Language Models (LLMs) in extracting complex political opinions through Target-Stance Extraction (TSE), a task that combines target identification and stance detection. The authors construct a dataset of 1,084 Reddit posts and evaluate various LLMs using different prompting strategies. The results show that the best models perform comparably to human annotators and remain robust on challenging posts. The study's findings demonstrate the potential of LLMs in extracting complex political opinions with minimal supervision, offering a scalable tool for computational social science and political text analysis.

Key Points

  • The study employs LLMs to address the challenge of extracting complex political opinions from online conversations.
  • The authors construct a dataset of 1,084 Reddit posts covering 138 distinct political targets.
  • The results show that the best models perform comparably to human annotators and remain robust on challenging posts.

Merits

Strength

The dataset of 1,084 Reddit posts spanning 138 distinct political targets supports a target-diverse evaluation of LLMs and offers a reusable benchmark for computational social science and political text analysis.

Robustness

The findings demonstrate the robustness of LLMs on challenging posts with low inter-annotator agreement, highlighting their potential in real-world applications.

Comparability

The results show that the best models perform comparably to highly trained human annotators, indicating the effectiveness of LLMs in extracting complex political opinions.

Demerits

Limitation

The study relies on posts from a single subreddit, r/NeutralPolitics, whose moderation norms favor measured, well-sourced discussion; the results may not transfer to more partisan or informal online political conversations.

Dependence on Data

Because the models are evaluated with zero-shot, few-shot, and context-augmented prompting rather than fine-tuning, their performance may depend heavily on the quality and diversity of their pretraining data as well as the prompts and in-context examples used.

Expert Commentary

The study's findings are significant in demonstrating that LLMs can extract complex political opinions from online conversations with minimal supervision. However, reliance on a single subreddit and sensitivity to data quality may limit how far the results generalize. Further research is needed to test LLM-based TSE in other discourse settings and to address the challenges of data quality and diversity. The findings also underscore the value of target-level, stance-aware analysis of public opinion, which can inform the development of policies and interventions aimed at promoting more informed public discourse.

Recommendations

  • Future studies should evaluate LLM-based TSE on platforms beyond Reddit, such as more partisan forums or news comment sections, to test whether the results generalize past r/NeutralPolitics.
  • Researchers should prioritize building high-quality, diverse benchmark datasets so that model performance and robustness can be measured across topics, platforms, and annotation conditions.

Sources

Original: arXiv - cs.CL