Category

Academic

Academic · 1 min

NuMuon: Nuclear-Norm-Constrained Muon for Compressible LLM Training

arXiv:2603.03597v1 Announce Type: new Abstract: The rapid progress of large language models (LLMs) is increasingly constrained by memory and deployment costs, motivating compression methods for …

Hadi Mohaghegh Dolatabadi, Thalaiyasingam Ajanthan, Sameera Ramasinghe, Chamin P Hewa Koneputugodage, Shamane Siriwardhana, Violetta Shevchenko, Karol Pajak, James Snewin, Gil Avraham, Alexander Long
12 views
Academic · 1 min

Riemannian Optimization in Modular Systems

arXiv:2603.03610v1 Announce Type: new Abstract: Understanding how systems built out of modular components can be jointly optimized is an important problem in biology, engineering, and …

Christian Pehle, Jean-Jacques Slotine
13 views
Academic · 1 min

Why Are Linear RNNs More Parallelizable?

arXiv:2603.03612v1 Announce Type: new Abstract: The community is increasingly exploring linear RNNs (LRNNs) as language models, motivated by their expressive power and parallelizability. While prior …

William Merrill, Hongjian Jiang, Yanhong Li, Ashish Sabharwal
20 views