Distribution-Aware Companding Quantization of Large Language Models
arXiv:2603.00364v1 Announce Type: new Abstract: Large language models such as GPT and Llama are trained with a next-token prediction loss. In this work, we suggest …
Quality follows upgrading
All Articles
arXiv:2603.00364v1 Announce Type: new Abstract: Large language models such as GPT and Llama are trained with a next-token prediction loss. In this work, we suggest …
arXiv:2603.00369v1 Announce Type: new Abstract: Consider an organization whose users send requests in natural language to an AI system that fulfills them by carrying out …
arXiv:2603.00426v1 Announce Type: new Abstract: The automatic generation of medical reports utilizing Multimodal Large Language Models (MLLMs) frequently encounters challenges related to factual instability, which …
arXiv:2603.00432v1 Announce Type: new Abstract: We introduce a typology-aware diagnostic for multilingual masked language models that tests reliance on word order versus inflectional form. Using …
arXiv:2603.00523v1 Announce Type: new Abstract: Mechanistic circuit discovery is notoriously sensitive to arbitrary analyst choices, especially pruning thresholds and feature dictionaries, often yielding brittle "one-shot" …
arXiv:2603.00573v1 Announce Type: new Abstract: Large language models (LLMs) achieve remarkable performance on diverse downstream and domain-specific tasks via parameter-efficient fine-tuning (PEFT). However, existing PEFT …
arXiv:2603.00582v1 Announce Type: new Abstract: While Large Language Models (LLMs) have demonstrated proficiency in Deep Research or Wide Search, their capacity to solve highly complex …
arXiv:2603.00612v1 Announce Type: new Abstract: The rapid growth of biomedical literature and curated databases has made it increasingly difficult for researchers to systematically connect biomarker …
arXiv:2603.00620v1 Announce Type: new Abstract: The growing number of languages considered in multilingual NLP, including new datasets and tasks, poses challenges regarding properly and accurately …
arXiv:2603.00621v1 Announce Type: new Abstract: Research in CDCR remains fragmented due to heterogeneous dataset formats, varying annotation standards, and the predominance of the CDCR definition …
arXiv:2603.00634v1 Announce Type: new Abstract: Multilingual falsehoods threaten information integrity worldwide, yet detection benchmarks remain confined to English or a few high-resource languages, leaving low-resource …
arXiv:2603.00669v1 Announce Type: new Abstract: Sustainability disclosure standards (e.g., GRI, SASB, TCFD, IFRS S2) are comprehensive yet lengthy, terminology-dense, and highly cross-referential, hindering structured analysis …