An LLM-Guided Query-Aware Inference System for GNN Models on Large Knowledge Graphs

arXiv:2603.04545v1 Announce Type: new Abstract: Efficient inference for graph neural networks (GNNs) on large knowledge graphs (KGs) is essential for many real-world applications. GNN inference queries are computationally expensive and vary in complexity, as each involves a different number of target nodes linked to subgraphs of diverse densities and structures. Existing acceleration methods, such as pruning, quantization, and knowledge distillation, instantiate smaller models but do not adapt them to the structure or semantics of individual queries. They also store models as monolithic files that must be fully loaded, and miss the opportunity to retrieve only the neighboring nodes and corresponding model components that are semantically relevant to the target nodes. These limitations lead to excessive data loading and redundant computation on large KGs. This paper presents KG-WISE, a task-driven inference paradigm for large KGs. KG-WISE decomposes trained GNN models into fine-grained components that can be partially loaded based on the structure of the queried subgraph. It employs large language models (LLMs) to generate reusable query templates that extract semantically relevant subgraphs for each task, enabling query-aware and compact model instantiation. We evaluate KG-WISE on six large KGs with up to 42 million nodes and 166 million edges. KG-WISE achieves up to 28x faster inference and 98% lower memory usage than state-of-the-art systems while maintaining or improving accuracy across both commercial and open-weight LLMs.

Executive Summary

This article presents KG-WISE, a task-driven inference paradigm for large knowledge graphs that decomposes trained GNN models into fine-grained components and uses large language models to generate reusable query templates. Evaluated on six large knowledge graphs, the system achieves substantial speedup (up to 28x) and memory reduction (up to 98%) while maintaining or improving accuracy. The authors demonstrate KG-WISE's potential in real-world applications such as relation prediction and entity disambiguation. Its ability to adapt to the structure and semantics of individual queries, retrieving only the relevant neighbors and model components, is a notable advance. However, broader evaluation on diverse datasets is still needed to establish where the approach's limits lie. The article contributes to the ongoing effort to build efficient and accurate GNN inference systems for large-scale knowledge graphs.

Key Points

  • KG-WISE decomposes trained GNN models into fine-grained components
  • Large language models generate reusable query templates
  • Achieves significant speedup and memory reduction while maintaining accuracy

Merits

Strength in adaptability

KG-WISE's ability to adapt to the structure and semantics of individual queries enables it to retrieve only relevant model components, reducing redundant computation and data loading.
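The idea can be illustrated with a small sketch: given target nodes and the relation types that a task-specific query template deems relevant (in KG-WISE the template would be generated by an LLM; here the relation set is supplied directly), only matching edges are retrieved around the targets. All function and relation names below are illustrative assumptions, not the paper's actual API.

```python
# Sketch of query-aware subgraph extraction: starting from the target
# nodes, follow only edges whose relation type is relevant to the task,
# instead of loading the full neighborhood.
from collections import defaultdict

def relevant_subgraph(edges, targets, relevant_relations, hops=1):
    """Return edges reachable from the targets via relevant relations."""
    adj = defaultdict(list)
    for src, rel, dst in edges:
        adj[src].append((rel, dst))
    frontier, kept = set(targets), []
    for _ in range(hops):
        next_frontier = set()
        for node in frontier:
            for rel, dst in adj[node]:
                if rel in relevant_relations:
                    kept.append((node, rel, dst))
                    next_frontier.add(dst)
        frontier = next_frontier
    return kept

# Toy KG: a citation edge chain plus an irrelevant authorship edge.
edges = [
    ("p1", "cites", "p2"),
    ("p1", "written_by", "a1"),
    ("p2", "cites", "p3"),
]
sub = relevant_subgraph(edges, ["p1"], {"cites"}, hops=2)
print(sub)  # [('p1', 'cites', 'p2'), ('p2', 'cites', 'p3')]
```

Because the irrelevant `written_by` edge is never expanded, both the data loaded and the downstream message-passing work shrink with the query's semantic scope.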

Efficient model instantiation

KG-WISE's task-driven approach allows for partial loading of models based on the queried subgraph, reducing memory usage and achieving faster inference.
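A minimal sketch of this style of partial loading, under the assumption that a trained relational GNN can be split into per-relation weight components stored as separate files: a query then loads only the components for the relation types its subgraph actually contains. The function names and file layout here are illustrative, not the paper's implementation.

```python
# Hypothetical component-wise model storage: each relation-specific
# weight matrix lives in its own file, so a query loads a subset of the
# model rather than one monolithic checkpoint.
import json
import os
import tempfile

def save_components(weights_by_relation, model_dir):
    """Persist each relation-specific component as its own file."""
    os.makedirs(model_dir, exist_ok=True)
    for relation, weight in weights_by_relation.items():
        with open(os.path.join(model_dir, f"{relation}.json"), "w") as f:
            json.dump(weight, f)

def load_for_query(model_dir, query_relations):
    """Load only the components needed by the query's relation types."""
    loaded = {}
    for relation in query_relations:
        with open(os.path.join(model_dir, f"{relation}.json")) as f:
            loaded[relation] = json.load(f)
    return loaded

# Toy model: three relation-specific weight matrices.
weights = {
    "cites": [[1.0, 0.0], [0.0, 1.0]],
    "written_by": [[0.5, 0.5], [0.5, 0.5]],
    "published_in": [[0.0, 1.0], [1.0, 0.0]],
}
model_dir = tempfile.mkdtemp()
save_components(weights, model_dir)

# A query whose subgraph only contains "cites" edges loads one component
# instead of the whole model file.
loaded = load_for_query(model_dir, ["cites"])
print(sorted(loaded))  # ['cites']
```

The memory saving scales with how few relation types a query touches, which matches the paper's observation that monolithic model files force loading far more than a given query needs.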

Demerits

Limited evaluation

KG-WISE is evaluated on six large knowledge graphs; broader evaluation across more diverse datasets and query workloads is needed to establish how well the reported gains generalize.

Dependence on LLMs

KG-WISE relies on large language models to generate query templates, which may not be universally available or accessible, limiting its applicability.

Expert Commentary

The article presents a meaningful advance in efficient GNN inference, addressing the pressing need for scalable methods on large knowledge graphs. While KG-WISE shows promise, its reliance on large language models for template generation is a notable constraint; future work should explore lighter-weight alternatives or ways to make these models more accessible. More broadly, the work underscores the importance of scalable, query-aware processing of large knowledge graphs for real-world applications.

Recommendations

  • Evaluate KG-WISE on a wider range of datasets and query workloads to characterize where its gains hold and where they degrade.
  • Investigate alternative methods for generating query templates, or reduce the cost and availability barriers of the LLMs the system depends on.
