AIMER: Calibration-Free Task-Agnostic MoE Pruning
arXiv:2603.18492v1 Announce Type: new Abstract: Mixture-of-Experts (MoE) language models increase parameter capacity without a proportional increase in per-token compute, but deployment still requires storing all experts, making expert pruning important for reducing memory and serving overhead. Existing task-agnostic expert pruning methods are typically calibration-dependent: they estimate expert importance from routing or activation statistics on a calibration set, which makes pruning outcomes sensitive to the choice of calibration set and adds substantial preprocessing cost. We introduce AIMER (Absolute mean over root mean square IMportance for Expert Ranking), a simple calibration-free criterion that yields clear within-layer score separation and distinct expert stratification. Across 7B to 30B MoE language models at 25% and 50% pruning ratios over 16 benchmarks, AIMER consistently delivers competitive or stronger overall performance against state-of-the-art calibration-based expert pruning baselines, with only 0.22–1.27 seconds needed to score the experts.
Executive Summary
This article presents AIMER, a calibration-free, task-agnostic expert pruning method for Mixture-of-Experts (MoE) language models. AIMER achieves competitive or stronger performance than state-of-the-art calibration-based expert pruning baselines while significantly reducing preprocessing cost and scoring time. The proposed criterion yields clear within-layer score separation and distinct expert stratification, making it a promising solution for deploying large-scale MoE language models. The results demonstrate AIMER's effectiveness across 16 benchmarks, on models from 7B to 30B parameters, at both 25% and 50% pruning ratios. The article highlights the importance of expert pruning for reducing memory and serving overhead in MoE language models.
Key Points
- AIMER is a calibration-free, task-agnostic expert pruning method for MoE language models.
- AIMER achieves competitive or stronger performance than calibration-based expert pruning baselines.
- The method significantly reduces preprocessing cost and scoring time.
- AIMER yields clear within-layer score separation and distinct expert stratification.
Merits
Simple and Efficient
AIMER introduces a simple, efficient, calibration-free criterion that eliminates calibration-set preprocessing entirely and scores a model's experts in 0.22–1.27 seconds.
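The abstract names the criterion ("absolute mean over root mean square") but does not give its exact formula. The sketch below is one plausible reading, not the paper's specification: each expert's weights are scored by mean absolute value divided by root mean square, and the highest-scoring experts within each layer are kept (the direction of importance is also an assumption).

```python
import math

def aimer_score(weights):
    """Score a flat list of expert weights by |mean| over RMS.

    Hypothetical reading of 'absolute mean over root mean square';
    the paper's exact definition may differ.
    """
    n = len(weights)
    abs_mean = sum(abs(w) for w in weights) / n
    rms = math.sqrt(sum(w * w for w in weights) / n)
    return abs_mean / rms

def prune_layer(expert_weights, ratio=0.25):
    """Rank a layer's experts and keep the top (1 - ratio) fraction.

    Assumes higher score means more important (our assumption).
    Returns the sorted indices of the experts to keep.
    """
    scores = [aimer_score(w) for w in expert_weights]
    n_keep = max(1, round(len(scores) * (1 - ratio)))
    order = sorted(range(len(scores)), key=scores.__getitem__, reverse=True)
    return sorted(order[:n_keep])
```

Because the score depends only on the stored weights, it needs no forward passes and no calibration data, which is consistent with the sub-second scoring times the abstract reports.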
Demerits
Limited Evaluation
Although the evaluation spans 16 benchmarks, it covers only two pruning ratios (25% and 50%) and models from 7B to 30B parameters, which may not fully capture AIMER's behavior at more aggressive pruning levels or on larger models.
Expert Commentary
The article presents a significant contribution to the field of expert pruning in MoE language models. AIMER's ability to achieve competitive or stronger performance without any calibration data is a notable achievement. However, the evaluation's restriction to two pruning ratios and a limited model-size range raises questions about generalizability. Further research is needed to fully assess AIMER's potential and its applicability to a broader range of use cases. Nevertheless, the article provides a valuable starting point for exploring calibration-free expert pruning in MoE language models.
Recommendations
- Future research should evaluate AIMER on a broader range of pruning ratios, model scales, and benchmarks to assess its performance and generalizability.
- Development of calibration-free pruning criteria in the spirit of AIMER should continue, to further explore what expert importance can be read directly from model weights.