Less is More: Adapting Text Embeddings for Low-Resource Languages with Small Scale Noisy Synthetic Data
arXiv:2603.22290v1. Abstract: Low-resource languages (LRLs) often lack high-quality, large-scale datasets for training effective text embedding models, hindering their application in tasks like …
Zaruhi Navasardyan, Spartak Bughdaryan, Bagrat Minasyan, Hrant Davtyan