Articles

Academic · 1 min

DiffuMask: Diffusion Language Model for Token-level Prompt Pruning

arXiv:2604.06627v1. Abstract: In-Context Learning and Chain-of-Thought prompting improve reasoning in large language models (LLMs). These typically come at the cost of longer, …

Caleb Zheng, Jyotika Singh, Fang Tu, Weiyi Sun, Sujeeth Bharadwaj, Yassine Benajiba, Sujith Ravi, Eli Shlizerman, Dan Roth
Academic · 1 min

Does a Global Perspective Help Prune Sparse MoEs Elegantly?

arXiv:2604.06542v1. Abstract: Empirical scaling laws for language models have encouraged the development of ever-larger LLMs, despite their growing computational and memory costs. …

Zeliang Zhang, Nikhil Ghosh, Jiani Liu, Bin Yu, Xiaodong Liu