Academic

Bridging Domains through Subspace-Aware Model Merging

arXiv:2603.05768v1 Announce Type: new Abstract: Model merging integrates multiple task-specific models into a single consolidated one. Recent research has made progress in improving merging performance …

Levy Chaves, Chao Zhou, Rebekka Burkholz, Eduardo Valle, Sandra Avila

Sparse Crosscoders for diffing MoEs and Dense models

arXiv:2603.05805v1 Announce Type: new Abstract: Mixture of Experts (MoE) models achieve parameter-efficient scaling through sparse expert routing, yet their internal representations remain poorly understood compared to …

Marmik Chaudhari, Nishkal Hundia, Idhant Gulati

MoE Lens -- An Expert Is All You Need

arXiv:2603.05806v1 Announce Type: new Abstract: Mixture of Experts (MoE) models enable parameter-efficient scaling through sparse expert activations, yet optimizing their inference and memory costs remains …

Marmik Chaudhari, Idhant Gulati, Nishkal Hundia, Pranav Karra, Shivam Raval

Stochastic Event Prediction via Temporal Motif Transitions

arXiv:2603.05874v1 Announce Type: new Abstract: Networks of timestamped interactions arise across social, financial, and biological domains, where forecasting future events requires modeling both evolving topology …

İbrahim Bahadır Altun, Ahmet Erdem Sarıyüce