
Controllable Evidence Selection in Retrieval-Augmented Question Answering via Deterministic Utility Gating


Victor P. Unda

arXiv:2603.18011v1

Abstract: Many modern AI question-answering systems convert text into vectors and retrieve the closest matches to a user question. While effective for topical similarity, similarity scores alone do not explain why some retrieved text can serve as evidence while other equally similar text cannot. When many candidates receive similar scores, systems may select sentences that are redundant, incomplete, or address different conditions than the question requires. This paper presents a deterministic evidence selection framework for retrieval-augmented question answering. The approach introduces Meaning-Utility Estimation (MUE) and Diversity-Utility Estimation (DUE), fixed scoring and redundancy-control procedures that determine evidence admissibility prior to answer generation. Each sentence or record is evaluated independently using explicit signals for semantic relatedness, term coverage, conceptual distinctiveness, and redundancy. No training or fine-tuning is required. In the prototype, a unit is accepted only if it explicitly states the fact, rule, or condition required by the task. Units are not merged or expanded. If no unit independently satisfies the requirement, the system returns no answer. This deterministic gating produces compact, auditable evidence sets and establishes a clear boundary between relevant text and usable evidence.

Executive Summary

This article proposes a deterministic evidence selection framework for retrieval-augmented question answering systems. The framework comprises two fixed procedures, Meaning-Utility Estimation (MUE) and Diversity-Utility Estimation (DUE), which evaluate each sentence or record independently using explicit signals for semantic relatedness, term coverage, conceptual distinctiveness, and redundancy. The approach requires no training or fine-tuning and produces compact, auditable evidence sets. Its deterministic gating establishes a clear boundary between merely relevant text and usable evidence, addressing redundancy and incompleteness in retrieved candidates. The framework could improve the reliability and transparency of question answering systems, particularly in high-stakes settings such as legal and medical applications.
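The abstract does not give the paper's exact scoring formulas, but the gating behavior it describes (score each unit independently, accept only units that cover the question, reject redundant units, abstain if nothing passes) can be sketched in plain Python. The function and signal names below, and the threshold values, are illustrative assumptions, not the authors' implementation; token overlap stands in for whatever relatedness and redundancy measures the paper actually uses.

```python
import re


def tokenize(text):
    """Lowercase word tokens; a stand-in for the paper's (unspecified) preprocessing."""
    return re.findall(r"[a-z0-9]+", text.lower())


def term_coverage(question, sentence):
    """Fraction of the question's terms that appear in the candidate sentence."""
    q, s = set(tokenize(question)), set(tokenize(sentence))
    return len(q & s) / len(q) if q else 0.0


def jaccard(a, b):
    """Token-overlap similarity, used here as a crude redundancy signal."""
    sa, sb = set(tokenize(a)), set(tokenize(b))
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0


def select_evidence(question, candidates, coverage_min=0.5, redundancy_max=0.8):
    """Deterministic gate: a unit is accepted only if it covers enough of the
    question's terms AND is not redundant with an already-accepted unit.
    Units are never merged; if nothing passes, the empty list signals abstention."""
    accepted = []
    for cand in candidates:
        if term_coverage(question, cand) < coverage_min:
            continue  # fails the coverage gate
        if any(jaccard(cand, prev) >= redundancy_max for prev in accepted):
            continue  # fails the redundancy gate
        accepted.append(cand)
    return accepted
```

Because every signal and threshold is fixed, the same inputs always yield the same evidence set, which is what makes the selection auditable: each rejection can be traced to a specific failed gate.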

Key Points

  • Proposal of a deterministic evidence selection framework for retrieval-augmented question answering systems
  • Use of Meaning-Utility Estimation (MUE) and Diversity-Utility Estimation (DUE) for evaluating evidence
  • Independent evaluation of each sentence or record using explicit signals
  • No training or fine-tuning required
  • Production of compact, auditable evidence sets
  • Establishment of a clear boundary between relevant text and usable evidence

Merits

Strength in addressing redundancy and incompleteness

The framework's deterministic gating mechanism addresses the issue of redundant and incomplete evidence in retrieved text, which is a significant limitation of current question answering systems.

Improved reliability and transparency

The framework's ability to produce compact, auditable evidence sets improves the reliability and transparency of question answering systems, particularly in high-stakes applications.

Demerits

Limited generalizability to non-deterministic systems

The framework's fixed rules and thresholds may limit its applicability in settings that call for probabilistic or adaptive evidence selection, where a hard accept/reject gate could prove too rigid.

Potential computational complexity

The framework's evaluation of each sentence or record using explicit signals may introduce computational complexity, particularly for large datasets and complex queries.
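The cost concern above can be made concrete with a rough comparison count. Assuming greedy selection (each candidate is scored once against the question and checked against every already-accepted unit), the worst case grows with the product of candidate count and accepted-set size. This model is an illustrative assumption, not an analysis from the paper.

```python
def gating_cost(n_candidates, n_accepted):
    """Rough comparison count for greedy gated selection:
    each candidate is scored once against the question, plus
    at most n_accepted pairwise redundancy checks per candidate."""
    per_question = n_candidates             # one relatedness/coverage pass each
    redundancy = n_candidates * n_accepted  # worst-case pairwise checks
    return per_question + redundancy
```

For example, 1,000 candidates against an accepted set capped at 10 units implies on the order of 11,000 signal evaluations, which is cheap per query but may add up over large corpora or high query volumes.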

Expert Commentary

The article presents a novel and promising approach to evidence selection in retrieval-augmented question answering systems. The deterministic framework's reliance on explicit signals for semantic relatedness, term coverage, conceptual distinctiveness, and redundancy addresses significant limitations of current systems. However, its limited generalizability to settings requiring adaptive selection and its potential computational cost warrant further investigation. The implications are most significant in high-stakes applications where transparency and accountability are critical, and the framework's potential to make evidence selection auditable makes it a valuable contribution to the field.

Recommendations

  • Further investigation into the framework's generalizability to non-deterministic systems
  • Implementation of the framework in high-stakes applications to evaluate its practical implications
