Articles

Academic · 1 min

When Learning Hurts: Fixed-Pole RNN for Real-Time Online Training

arXiv:2602.21454v1. Abstract: Recurrent neural networks (RNNs) can be interpreted as discrete-time state-space models, where the state evolution corresponds to an infinite-impulse-response (IIR) …

Alexander Morgan, Ummay Sumaya Khan, Lingjia Liu, Lizhong Zheng
21 views

Academic · 1 min

Muon+: Towards Better Muon via One Additional Normalization Step

arXiv:2602.21545v1. Abstract: The Muon optimizer has demonstrated promising performance in pre-training large language models through gradient (or momentum) orthogonalization. In this work, …

Ruijie Zhang, Yequan Zhao, Ziyue Liu, Zhengyang Wang, Zheng Zhang
18 views