Interpreting Negation in GPT-2: Layer- and Head-Level Causal Analysis
arXiv:2603.12423v1
Abstract: Negation remains a persistent challenge for modern language models, often causing reversed meanings or factual errors. In this work, we …
Abdullah Al Mofael, Lisa M. Kuhn, Ghassan Alkadi, Kuo-Pao Yang