RMNP: Row-Momentum Normalized Preconditioning for Scalable Matrix-Based Optimization
arXiv:2603.20527v1 Announce Type: new Abstract: Preconditioned adaptive methods have gained significant attention for training deep neural networks, as they capture rich curvature information of the …