Articles

Academic · 1 min

Attribution Bias in Large Language Models

arXiv:2604.05224v1. Abstract: As Large Language Models (LLMs) are increasingly used to support search and information retrieval, it is critical that they accurately …

Eliza Berman, Bella Chang, Daniel B. Neill, Emily Black
27 views

Academic · 1 min

AutoSOTA: An End-to-End Automated Research System for State-of-the-Art AI Model Discovery

arXiv:2604.05550v1. Abstract: Artificial intelligence research increasingly depends on prolonged cycles of reproduction, debugging, and iterative refinement to achieve State-Of-The-Art (SOTA) performance, creating …

Yu Li, Chenyang Shao, Xinyang Liu, Ruotong Zhao, Peijie Liu, Hongyuan Su, Zhibin Chen, Qinglong Yang, Anjie Xu, Yi Fang, Qingbin Zeng, Tianxing Li, Jingbo Xu, Fengli Xu, Yong Li, Tie-Yan Liu
53 views

Academic · 1 min

Faster Superword Tokenization

arXiv:2604.05192v1. Abstract: Byte Pair Encoding (BPE) is a widely used tokenization algorithm, whose tokens cannot extend across pre-tokenization boundaries, functionally limiting it …

Craig W. Schmidt, Chris Tanner, Yuval Pinter
32 views

Academic · 1 min

Weight-Informed Self-Explaining Clustering for Mixed-Type Tabular Data

arXiv:2604.05857v1. Abstract: Clustering mixed-type tabular data is fundamental for exploratory analysis, yet remains challenging due to misaligned numerical-categorical representations, uneven and context-dependent …

Lehao Li, Qiang Huang, Yihao Ang, Bryan Kian Hsiang Low, Anthony K. H. Tung, Xiaokui Xiao
54 views