Selective Memory for Artificial Intelligence: Write-Time Gating with Hierarchical Archiving
arXiv:2603.15994v1. Abstract: Retrieval-augmented generation stores all content indiscriminately, degrading accuracy as noise accumulates. Parametric approaches compress knowledge into weights, precluding selective updates. Neither mirrors biological memory, which gates encoding based on salience and archives rather than deletes superseded information. We introduce write-time gating that filters incoming knowledge objects using composite salience scores (source reputation, novelty, reliability) while maintaining version chains that preserve prior states. Using real LLM evaluation without oracle access to quality labels, write gating achieves 100 percent accuracy versus 13 percent for ungated stores. The critical finding emerges under distractor scaling: at 8:1 distractor ratios, read-time filtering (Self-RAG) collapses to 0 percent while write gating maintains 100 percent, revealing a structural advantage of write-time over read-time curation. Validation on Wikipedia (20 entities), procedurally generated pharmacology data, and 2026 arXiv papers confirms these findings. The gating advantage scales inversely with parametric memory support: +25pp for Wikipedia, +48pp for post-cutoff arXiv, +65pp for procedural data with zero training knowledge. Signal ablation confirms the method does not depend on oracle-correlated metadata. Write gating matches Self-RAG accuracy at one-ninth the query-time cost.
Executive Summary
The article introduces write-time gating with hierarchical archiving to counter the accuracy loss that retrieval-augmented generation suffers as indiscriminately stored content accumulates. Incoming knowledge objects are scored for composite salience (source reputation, novelty, reliability) at write time, and superseded entries are kept in version chains rather than deleted. Evaluated with a real LLM and without oracle access to quality labels, the gated store reaches 100% accuracy versus 13% for an ungated store, and it stays at 100% under 8:1 distractor ratios where read-time filtering (Self-RAG) falls to 0%. The findings are validated on Wikipedia (20 entities), procedurally generated pharmacology data, and 2026 arXiv papers. Write gating matches Self-RAG accuracy at one-ninth the query-time cost.
Key Points
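- ▸ Write-time gating filters incoming knowledge objects by a composite salience score (source reputation, novelty, reliability) before they enter the store.
- ▸ Hierarchical archiving keeps superseded entries in version chains, so prior states are preserved rather than deleted.
- ▸ The gating advantage grows as the model's parametric knowledge of the domain shrinks (+25pp on Wikipedia, +48pp on post-cutoff arXiv, +65pp on procedural data), and write gating stays at 100% under 8:1 distractor ratios where read-time filtering (Self-RAG) drops to 0%.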
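The gating and archiving ideas in the points above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class names, the weighted-sum form of the salience score, the weights, the 0.5 threshold, and the exact-match novelty check are all assumptions made for illustration, since the abstract does not specify how the three signals are combined.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class KnowledgeObject:
    key: str                  # what the fact is about (hypothetical field)
    value: str                # the claimed content
    source_reputation: float  # assumed to lie in [0, 1]
    reliability: float        # assumed to lie in [0, 1]


@dataclass
class GatedStore:
    # Threshold and weights are illustrative assumptions, not values from the paper.
    threshold: float = 0.5
    weights: Tuple[float, float, float] = (0.4, 0.3, 0.3)  # reputation, novelty, reliability
    active: Dict[str, KnowledgeObject] = field(default_factory=dict)
    archive: Dict[str, List[KnowledgeObject]] = field(default_factory=dict)

    def novelty(self, obj: KnowledgeObject) -> float:
        # Crude stand-in for novelty: 0 if the store already holds this exact value.
        current = self.active.get(obj.key)
        return 0.0 if current is not None and current.value == obj.value else 1.0

    def salience(self, obj: KnowledgeObject) -> float:
        # Composite score as a weighted sum of the three signals (assumed form).
        w_rep, w_nov, w_rel = self.weights
        return (w_rep * obj.source_reputation
                + w_nov * self.novelty(obj)
                + w_rel * obj.reliability)

    def write(self, obj: KnowledgeObject) -> bool:
        # Write-time gate: low-salience objects never enter the store.
        if self.salience(obj) < self.threshold:
            return False
        previous = self.active.get(obj.key)
        # Archive the superseded version instead of deleting it.
        if previous is not None and previous.value != obj.value:
            self.archive.setdefault(obj.key, []).append(previous)
        self.active[obj.key] = obj
        return True

    def history(self, key: str) -> List[KnowledgeObject]:
        # Version chain, oldest first, ending with the active entry if one exists.
        chain = list(self.archive.get(key, []))
        if key in self.active:
            chain.append(self.active[key])
        return chain


store = GatedStore()
print(store.write(KnowledgeObject("drug_x.dose", "10 mg", 0.9, 0.9)))   # trusted source
print(store.write(KnowledgeObject("drug_x.dose", "999 mg", 0.1, 0.1)))  # low-quality distractor
print(store.write(KnowledgeObject("drug_x.dose", "20 mg", 0.8, 0.9)))   # trusted update
print([o.value for o in store.history("drug_x.dose")])
```

In this toy run the distractor is rejected at write time, so a later query never has to rank it, and the superseded "10 mg" entry remains reachable through the version chain. A real system would replace the exact-match novelty check and the hand-set scores with learned or measured signals.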
Merits
Structural Advantage
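Filtering at write time held accuracy at 100% under an 8:1 distractor ratio, where read-time filtering (Self-RAG) dropped to 0%. The design also follows the biological pattern of gating encoding by salience and archiving superseded information rather than deleting it.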
Scalability and Efficiency
Matches Self-RAG accuracy at one-ninth the query-time cost, which makes it practical for real-world deployment.
Demerits
Generalization Assumption
Although validation spans three datasets (Wikipedia with 20 entities, procedurally generated pharmacology data, and 2026 arXiv papers), generalization to non-document-based AI tasks or to other domains remains untested.
Expert Commentary
This work represents a meaningful pivot in the evolution of RAG systems. The authors correctly identify a fundamental mismatch between current AI memory paradigms and biological memory mechanisms: the absence of selective gating and version preservation. The composite salience scoring mechanism is particularly elegant. It combines three signals (source reputation, novelty, reliability) into a heuristic that mirrors human cognitive prioritization, enabling precision in knowledge accumulation. The empirical results are compelling, especially the contrast between write-time and read-time curation under distractor scaling: write gating holds 100% accuracy at 8:1 distractor ratios while Self-RAG collapses to 0%, a strong indicator of architectural robustness. Notably, the signal ablation, which confirms that the method does not depend on oracle-correlated metadata, adds substantial credibility to the findings. This is not merely an incremental improvement but a paradigm shift toward memory fidelity in AI. The implications extend beyond technical optimization into the broader discourse on AI accountability, as accurate memory systems become increasingly central to legal, medical, and scientific applications.
Recommendations
- ✓ Adopt write-time gating as a standard enhancement in RAG architectures for high-stakes applications.
- ✓ Explore hybridization with parametric memory models to combine selective gating with computational efficiency.
- ✓ Fund longitudinal studies to evaluate generalization across emerging AI use cases, particularly in clinical or legal domains where precision is paramount.