Academic

Academic · 1 min

Expert-Choice Routing Enables Adaptive Computation in Diffusion Language Models

arXiv:2604.01622v1. Abstract: Diffusion language models (DLMs) enable parallel, non-autoregressive text generation, yet existing DLM mixture-of-experts (MoE) models inherit token-choice (TC) routing from …

Shuibai Zhang, Caspian Zhuang, Chihan Cui, Zhihan Yang, Fred Zhangzhi Peng, Yanxin Zhang, Haoyue Bai, Zack Jia, Yang Zhou, Guanhua Chen, Ming Liu
7 views
Academic · 1 min

Model Merging via Data-Free Covariance Estimation

arXiv:2604.01329v1. Abstract: Model merging provides a way of cheaply combining individual models to produce a model that inherits each individual's capabilities. While …

Marawan Gamal Abdel Hameed, Derek Tam, Pascal Jr Tikeng Notsawo, Colin Raffel, Guillaume Rabusseau
13 views
Academic · 1 min

MiCA Learns More Knowledge Than LoRA and Full Fine-Tuning

arXiv:2604.01694v1. Abstract: Minor Component Adaptation (MiCA) is a novel parameter-efficient fine-tuning method for large language models that focuses on adapting underutilized subspaces …

Sten Rüdiger, Sebastian Raschka
9 views
Academic · 1 min

Residuals-based Offline Reinforcement Learning

arXiv:2604.01378v1. Abstract: Offline reinforcement learning (RL) has received increasing attention for learning policies from previously collected data without interaction with the real …

Qing Zhu, Xian Yu
11 views
Academic · 1 min

HippoCamp: Benchmarking Contextual Agents on Personal Computers

arXiv:2604.01221v1. Abstract: We present HippoCamp, a new benchmark designed to evaluate agents' capabilities on multimodal file management. Unlike existing agent benchmarks that …

Zhe Yang, Shulin Tian, Kairui Hu, Shuai Liu, Hoang-Nhat Nguyen, Yichi Zhang, Zujin Guo, Mengying Yu, Zinan Zhang, Jingkang Yang, Chen Change Loy, Ziwei Liu
16 views