Learning Retrieval Models with Sparse Autoencoders
arXiv:2603.13277v1

Abstract: Sparse autoencoders (SAEs) provide a powerful mechanism for decomposing the dense representations produced by Large Language Models (LLMs) into interpretable latent features. We posit that SAEs constitute a natural foundation for Learned Sparse Retrieval (LSR), whose objective is to encode queries and documents into high-dimensional sparse representations optimized for efficient retrieval. In contrast to existing LSR approaches that project input sequences into the vocabulary space, SAE-based representations offer the potential to produce more semantically structured, expressive, and language-agnostic features. Building on this insight, we introduce SPLARE, a method to train SAE-based LSR models. Our experiments, relying on recently released open-source SAEs, demonstrate that this technique consistently outperforms vocabulary-based LSR in multilingual and out-of-domain settings. SPLARE-7B, a multilingual retrieval model capable of producing generalizable sparse latent embeddings for a wide range of languages and domains, achieves top results on MMTEB's multilingual and English retrieval tasks. We also developed a 2B-parameter variant with a significantly lighter footprint.
Executive Summary
This article introduces SPLARE, a method for training sparse autoencoder-based learned sparse retrieval models. The approach leverages sparse autoencoders to produce semantically structured and expressive features, outperforming traditional vocabulary-based methods in multilingual and out-of-domain settings. Experiments demonstrate the effectiveness of SPLARE, with a 7B-parameter model achieving top results on multilingual and English retrieval tasks.
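To make the mechanism concrete, here is a minimal sketch of how a sparse autoencoder can turn a dense LLM hidden state into a sparse embedding suitable for retrieval. This is an illustrative toy, not the SPLARE implementation: the TopK activation, the weight shapes, and all names (`topk_sae_encode`, `W_enc`, `b_enc`) are assumptions made for the example.

```python
import numpy as np

def topk_sae_encode(h, W_enc, b_enc, k=32):
    """Encode a dense hidden state h of shape (d,) into a sparse latent
    vector of shape (m,), keeping only the k largest ReLU activations.
    W_enc (d, m) and b_enc (m,) are hypothetical SAE encoder parameters."""
    pre = h @ W_enc + b_enc            # dense pre-activations, shape (m,)
    z = np.maximum(pre, 0.0)           # ReLU
    if k < z.size:
        drop = np.argpartition(z, -k)[:-k]
        z[drop] = 0.0                  # zero out everything but the top-k
    return z

def sparse_dot(q, d_vec):
    """Retrieval score: inner product of two sparse latent vectors."""
    return float(q @ d_vec)

# Toy demonstration with random weights; real SAEs are far larger.
rng = np.random.default_rng(0)
d, m = 64, 512
W, b = rng.normal(size=(d, m)) * 0.1, np.zeros(m)
q = topk_sae_encode(rng.normal(size=d), W, b, k=32)
doc = topk_sae_encode(rng.normal(size=d), W, b, k=32)
score = sparse_dot(q, doc)             # at most 32 latents active per side
```

The TopK constraint guarantees a fixed sparsity budget per vector, which is what makes inverted-index retrieval over the latent space tractable.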
Key Points
- ▸ Introduction of SPLARE, a method for training SAE-based LSR models
- ▸ Use of sparse autoencoders to produce semantically structured features
- ▸ Consistent gains over traditional vocabulary-based LSR in multilingual and out-of-domain settings
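Sparse latent embeddings can be served with the same inverted-index machinery that vocabulary-based LSR uses; only the meaning of the dimensions changes (SAE latents instead of vocabulary terms). The following sketch is a hypothetical illustration of that serving path, not code from the paper; the dictionary-based sparse-vector format and function names are assumptions.

```python
from collections import defaultdict

def build_inverted_index(doc_vecs):
    """doc_vecs maps doc_id -> {latent_id: weight} (a sparse vector).
    Returns an index mapping latent_id -> list of (doc_id, weight) postings."""
    index = defaultdict(list)
    for doc_id, vec in doc_vecs.items():
        for latent_id, weight in vec.items():
            index[latent_id].append((doc_id, weight))
    return index

def score_query(query_vec, index):
    """Accumulate dot-product scores by traversing only the postings
    of latents that are active in the query."""
    scores = defaultdict(float)
    for latent_id, qw in query_vec.items():
        for doc_id, dw in index.get(latent_id, []):
            scores[doc_id] += qw * dw
    return sorted(scores.items(), key=lambda item: -item[1])

docs = {"d1": {3: 0.9, 7: 0.4}, "d2": {3: 0.2, 11: 1.0}}
idx = build_inverted_index(docs)
ranking = score_query({3: 1.0, 11: 0.5}, idx)
# d1 scores 0.9; d2 scores 0.2 + 0.5 = 0.7, so d1 ranks first
```

Because each query activates only a handful of latents, scoring touches a small fraction of the index, which is the efficiency property that sparse retrieval is optimized for.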
Merits
Improved Retrieval Performance
SPLARE demonstrates improved retrieval performance in multilingual and out-of-domain settings, making it a valuable approach for information retrieval tasks.
Demerits
Computational Complexity
The use of sparse autoencoders may increase computational complexity, potentially limiting the applicability of SPLARE in resource-constrained environments.
Expert Commentary
The introduction of SPLARE represents a meaningful advance in learned sparse retrieval. By building on sparse autoencoders, SPLARE produces more semantically structured and expressive features, which translates into improved retrieval performance in multilingual and out-of-domain settings. Further research is still needed to explore the approach across domains and to address limitations such as computational complexity. Even so, SPLARE has clear implications for information retrieval and could improve both the effectiveness and the efficiency of real-world applications.
Recommendations
- ✓ Further research on the application of SPLARE in various domains and settings
- ✓ Investigation into methods for reducing computational complexity and improving the scalability of SPLARE