GONE: Structural Knowledge Unlearning via Neighborhood-Expanded Distribution Shaping
arXiv:2603.12275v1 Announce Type: new Abstract: Unlearning knowledge is a pressing and challenging task in Large Language Models (LLMs) because of their unprecedented capacity to memorize and digest training data at scale, raising significant concerns regarding safety, privacy, and intellectual property. However, existing works, including parameter editing, fine-tuning, and distillation-based methods, all focus on flat, sentence-level data and overlook the relational, multi-hop, and reasoned knowledge present in naturally structured data. In response to this gap, this paper introduces Graph Oblivion and Node Erasure (GONE), a benchmark for evaluating knowledge unlearning over structured knowledge graph (KG) facts in LLMs. This KG-based benchmark enables the disentanglement of three effects of unlearning: direct fact removal, reasoning-based leakage, and catastrophic forgetting. In addition, Neighborhood-Expanded Distribution Shaping (NEDS), a novel unlearning framework, is designed to leverage graph connectivity and identify anchor-correlated neighbors, enforcing a precise decision boundary between the forgotten fact and its semantic neighborhood. Evaluations on LLaMA-3-8B and Mistral-7B across multiple knowledge editing and unlearning methods showcase NEDS's superior performance (1.000 on unlearning efficacy and 0.839 on locality) on GONE and other benchmarks. Code is available at https://anonymous.4open.science/r/GONE-4679/.
Executive Summary
This article presents GONE, a benchmark for evaluating knowledge unlearning in Large Language Models (LLMs) over structured knowledge graph (KG) facts. The authors propose Neighborhood-Expanded Distribution Shaping (NEDS), an unlearning framework that leverages graph connectivity to identify anchor-correlated neighbors and enforce a precise decision boundary between the forgotten fact and its semantic neighborhood. Evaluations on LLaMA-3-8B and Mistral-7B show NEDS outperforming existing knowledge editing and unlearning methods on both unlearning efficacy and locality. Motivated by safety, privacy, and intellectual property concerns, the work moves beyond prior methods' focus on flat, sentence-level data to a more comprehensive, structure-aware evaluation of knowledge unlearning.
Key Points
- ▸ GONE: a benchmark for evaluating knowledge unlearning in LLMs using KG facts
- ▸ Neighborhood-Expanded Distribution Shaping (NEDS): a novel unlearning framework leveraging graph connectivity
- ▸ NEDS demonstrates superior performance on unlearning efficacy and locality
- ▸ Motivation: safety, privacy, and intellectual property risks arising from LLMs' memorization of training data
- ▸ Comprehensive evaluation of knowledge unlearning considering relational, multi-hop, and reasoned knowledge
Merits
Strength in addressing knowledge unlearning limitations
GONE and NEDS address the limitations of existing methods that target only flat, sentence-level data, enabling evaluation of unlearning over relational, multi-hop, and reasoned knowledge.
Superior performance on unlearning efficacy and locality
Evaluations on LLaMA-3-8B and Mistral-7B report NEDS achieving 1.000 unlearning efficacy and 0.839 locality, outperforming competing knowledge editing and unlearning methods on GONE and other benchmarks.
Novel approach
NEDS's use of graph connectivity to identify anchor-correlated neighbors, and to enforce a precise decision boundary between the forgotten fact and its semantic neighborhood, is a genuinely new angle on the unlearning problem.
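To make the neighborhood-expansion idea concrete, the following is a minimal sketch of how a retain set could be gathered around a fact marked for forgetting. This is an illustrative approximation, not the authors' implementation: the function name, the undirected k-hop expansion, and the toy KG are all assumptions introduced here.

```python
from collections import defaultdict, deque

def expand_neighborhood(triples, anchor_entity, hops=2):
    # Hypothetical sketch: collect entities within `hops` edges of the
    # anchor in an undirected view of the KG. These neighbors stand in
    # for the "semantic neighborhood" an unlearning method should
    # preserve while the anchor's fact is erased.
    adj = defaultdict(set)
    for head, _relation, tail in triples:
        adj[head].add(tail)
        adj[tail].add(head)

    seen = {anchor_entity}
    frontier = deque([(anchor_entity, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if depth == hops:
            continue  # do not expand beyond the hop budget
        for neighbor in adj[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                frontier.append((neighbor, depth + 1))

    seen.discard(anchor_entity)  # the anchor itself is the forget target
    return seen

# Toy KG; suppose the fact ("Alice", "works_at", "AcmeCorp") must be forgotten.
kg = [
    ("Alice", "works_at", "AcmeCorp"),
    ("AcmeCorp", "located_in", "Paris"),
    ("Paris", "capital_of", "France"),
    ("Bob", "lives_in", "Berlin"),
]
retain = expand_neighborhood(kg, "Alice", hops=2)
print(sorted(retain))  # ['AcmeCorp', 'Paris']
```

In an actual pipeline, such a retain set would supply the contrastive signal for shaping the model's output distribution: suppress the forgotten fact while keeping predictions on neighboring facts unchanged, which is what the efficacy and locality metrics measure.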
Demerits
Limited model and dataset coverage
NEDS is evaluated on only two models, LLaMA-3-8B and Mistral-7B; assessing it across a wider range of model architectures, scales, and datasets would strengthen the generality of the claims.
Lack of consideration for potential bias in KG facts
The evaluation implicitly treats the KG facts as unbiased, but in practice systematic biases in graph coverage or relation distribution could skew NEDS's measured performance.
Scalability and computational efficiency concerns
The article does not discuss the scalability and computational efficiency of NEDS, which could be a concern for large-scale applications.
Expert Commentary
This article makes a timely contribution to natural language processing and AI safety. By moving beyond flat, sentence-level data, the authors enable a more comprehensive evaluation of knowledge unlearning that separates direct fact removal from reasoning-based leakage and catastrophic forgetting. The use of graph connectivity to identify anchor-correlated neighbors and enforce a precise decision boundary around the forgotten fact is a novel approach. Despite the limitations noted above, notably the narrow model coverage and the unexamined potential for bias in the KG facts, the work is a significant step toward structure-aware unlearning methods for LLMs.
Recommendations
- ✓ Recommendation 1: Further research should be conducted to evaluate NEDS on a more diverse range of datasets and to investigate its scalability and computational efficiency.
- ✓ Recommendation 2: The approach should be tested within existing LLM deployment pipelines to verify that its unlearning guarantees hold without degrading downstream task performance.