Why Deep Jacobian Spectra Separate: Depth-Induced Scaling and Singular-Vector Alignment
arXiv:2602.12384v2 Announce Type: cross
Abstract: Understanding why gradient-based training in deep networks exhibits strong implicit bias remains challenging, in part because tractable singular-value dynamics are …