Rational Neural Networks have Expressivity Advantages
arXiv:2602.12390v1 Announce Type: cross Abstract: We study neural networks with trainable low-degree rational activation functions and show that they are more expressive and parameter-efficient than modern piecewise-linear and smooth activations such as ELU, LeakyReLU, LogSigmoid, PReLU, ReLU, SELU, CELU, Sigmoid, SiLU, Mish, Softplus, Tanh, Softmin, Softmax, and LogSoftmax. For an error target of $\varepsilon>0$, we establish approximation-theoretic separations: Any network built from standard fixed activations can be uniformly approximated on compact domains by a rational-activation network with only $\mathrm{poly}(\log\log(1/\varepsilon))$ overhead in size, while the converse provably requires $\Omega(\log(1/\varepsilon))$ parameters in the worst case. This exponential gap persists at the level of full networks and extends to gated activations and transformer-style nonlinearities. In practice, rational activations integrate seamlessly into standard architectures and training pipelines, allowing rationals to match or outperform fixed activations under identical architectures and optimizers.
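To make the object of study concrete, here is a minimal sketch of a trainable low-degree rational activation as a PyTorch module. The degrees (3 over 2), the Horner evaluation, and the small random initialization are illustrative assumptions on our part; the paper's exact parameterization is not given in the abstract.

```python
import torch
import torch.nn as nn

class RationalActivation(nn.Module):
    """Elementwise trainable rational activation R(x) = P(x) / Q(x).

    Illustrative sketch: degree-3 numerator over degree-2 denominator,
    with the denominator's constant term fixed at 1. Not necessarily
    the paper's parameterization.
    """

    def __init__(self, num_degree: int = 3, den_degree: int = 2):
        super().__init__()
        # Numerator coefficients p_0 ... p_m, trainable.
        self.p = nn.Parameter(0.1 * torch.randn(num_degree + 1))
        # Denominator coefficients q_1 ... q_n; constant term fixed to 1
        # so Q cannot collapse to zero everywhere.
        self.q = nn.Parameter(0.1 * torch.randn(den_degree))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Horner evaluation of P(x) = p_m x^m + ... + p_0.
        num = torch.zeros_like(x)
        for c in self.p.flip(0):
            num = num * x + c
        # Q(x) = 1 + q_1 x + ... + q_n x^n (naive monomial evaluation;
        # fine for low degree). Note Q can still cross zero for some
        # inputs; see the stability sketch under 'Training Stability'.
        den = torch.ones_like(x)
        xp = x
        for c in self.q:
            den = den + c * xp
            xp = xp * x
        return num / den
```

Small random coefficients keep the initial function close to linear; a real implementation might instead initialize $P/Q$ to fit a familiar activation such as ReLU before training.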
Executive Summary
The article 'Rational Neural Networks have Expressivity Advantages' analyzes neural networks with trainable low-degree rational activation functions and argues that they are more expressive and parameter-efficient than modern piecewise-linear and smooth activations. The authors establish approximation-theoretic separations: a rational-activation network can uniformly approximate any network built from standard fixed activations on compact domains with only $\mathrm{poly}(\log\log(1/\varepsilon))$ overhead in size, whereas the converse direction provably requires $\Omega(\log(1/\varepsilon))$ parameters in the worst case. This exponential gap persists at the level of full networks, extends to gated activations and transformer-style nonlinearities, and, per the abstract, carries over to practice, since rational activations integrate into standard architectures and training pipelines.
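Concretely, writing $N$ for a fixed-activation network and $\varepsilon$ for the target uniform error on a compact set $K$, the two directions of the separation can be summarized as follows (a paraphrase of the abstract's claim, not a new result):

```latex
% Direction 1: fixed activations -> rationals, doubly-logarithmic overhead.
\exists\, R \ \text{rational-activation}: \quad
\sup_{x \in K} |R(x) - N(x)| \le \varepsilon,
\qquad
\mathrm{size}(R) \le \mathrm{size}(N) \cdot \mathrm{poly}\bigl(\log\log(1/\varepsilon)\bigr).

% Direction 2: rationals -> fixed activations, worst case.
\text{any } N' \text{ with } \sup_{x \in K} |N'(x) - R(x)| \le \varepsilon
\ \text{ has } \ \Omega\bigl(\log(1/\varepsilon)\bigr) \text{ parameters.}
```

Since $\log(1/\varepsilon) = \exp(\log\log(1/\varepsilon))$, the lower bound is exponentially larger than the overhead, which is the "exponential gap" referred to in the abstract.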
Key Points
- ▸ Rational activation functions offer superior expressivity and parameter efficiency.
- ▸ Approximation-theoretic separations show an exponential gap, $\mathrm{poly}(\log\log(1/\varepsilon))$ versus $\Omega(\log(1/\varepsilon))$, over standard fixed activations.
- ▸ Rational activations integrate seamlessly into existing architectures and training pipelines.
Merits
Theoretical Rigor
The article grounds its claims in explicit approximation-theoretic separations with quantitative bounds, rather than heuristic or purely empirical arguments.
Practical Applicability
The findings are not merely theoretical: per the abstract, rational activations integrate into standard architectures and training pipelines and match or outperform fixed activations under identical architectures and optimizers.
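As an illustration of this drop-in property, the hypothetical `RationalActivation` sketch above replaces a fixed activation in an ordinary training loop with no other changes (the model shape, toy data, and optimizer below are arbitrary choices, not the paper's setup):

```python
import torch
import torch.nn as nn

# Same model as a ReLU baseline, with only the activation swapped out.
model = nn.Sequential(
    nn.Linear(32, 64),
    RationalActivation(),  # in place of nn.ReLU(); sketch defined earlier
    nn.Linear(64, 1),
)
# The rational coefficients are nn.Parameters inside a submodule, so
# model.parameters() picks them up automatically.
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

x, y = torch.randn(256, 32), torch.randn(256, 1)  # toy regression data
for step in range(100):
    opt.zero_grad()
    loss = nn.functional.mse_loss(model(x), y)
    loss.backward()
    opt.step()
```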
Broad Applicability
The advantages extend beyond plain feedforward networks: the separation also covers gated activations and transformer-style nonlinearities.
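For the gated case, one plausible construction, an assumption on our part rather than the paper's definition, replaces the fixed gate in a SiLU-style unit with a trainable rational:

```python
import torch
import torch.nn as nn

class RationalGate(nn.Module):
    """Gated unit x * R(x), analogous to SiLU = x * sigmoid(x), with the
    fixed gate replaced by a trainable rational. Hypothetical sketch."""

    def __init__(self):
        super().__init__()
        self.gate = RationalActivation()  # sketch defined earlier

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x * self.gate(x)
```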
Demerits
Complexity
Rational activations introduce extra trainable coefficients and a less familiar approximation theory than ReLU-style activations, which may pose challenges for implementation and understanding, particularly for practitioners without a strong mathematical background.
Generalization
While the abstract reports that rationals match or outperform fixed activations under identical setups, these findings should be validated across a broader range of practical applications and datasets before general conclusions are drawn.
Training Stability
The stability and convergence properties of training networks with rational activations need further investigation: in particular, a trainable denominator can approach zero on parts of the input domain, producing poles and unbounded gradients.
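One known mitigation for this instability is the "safe" parameterization used in Padé Activation Units (Molina et al., 2019), which keeps the denominator bounded below by 1 so no poles can form during training. A sketch adapting the earlier module (again an illustration, not necessarily the paper's choice):

```python
import torch
import torch.nn as nn

class SafeRationalActivation(nn.Module):
    """Rational activation with a pole-free denominator:
    R(x) = P(x) / (1 + |q_1 x + ... + q_n x^n|), so the denominator is
    >= 1 everywhere. Sketch in the style of 'safe' Pade Activation Units.
    """

    def __init__(self, num_degree: int = 3, den_degree: int = 2):
        super().__init__()
        self.p = nn.Parameter(0.1 * torch.randn(num_degree + 1))
        self.q = nn.Parameter(0.1 * torch.randn(den_degree))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Horner evaluation of P(x).
        num = torch.zeros_like(x)
        for c in self.p.flip(0):
            num = num * x + c
        # q_1 x + ... + q_n x^n, then wrapped in 1 + |.|.
        qpoly = torch.zeros_like(x)
        xp = x
        for c in self.q:
            qpoly = qpoly + c * xp
            xp = xp * x
        return num / (1.0 + qpoly.abs())  # denominator >= 1 everywhere
```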
Expert Commentary
The article 'Rational Neural Networks have Expressivity Advantages' presents a compelling case for rational activation functions over traditional piecewise-linear and smooth activations. The approximation-theoretic separations give the claimed advantage a quantitative footing: a $\mathrm{poly}(\log\log(1/\varepsilon))$ overhead in one direction against a provable $\Omega(\log(1/\varepsilon))$ cost in the other, a gap that persists for full networks and extends to gated and transformer-style nonlinearities. Equally notable is the practical side: because rational activations slot into existing architectures and optimizers unchanged, the benefits are not confined to theory. The open questions are mostly practical. Trainable rational functions add implementation and analysis burden relative to fixed activations; the empirical claims should be validated across a broader range of applications and datasets; and the stability and convergence of training with a learnable denominator deserve dedicated study. Overall, the article is a significant contribution to neural network design, with value for both theoretical and applied research.
Recommendations
- ✓ Further empirical studies should be conducted to validate the advantages of rational activation functions across a diverse range of applications and datasets.
- ✓ Researchers should explore the stability and convergence properties of training rational activation networks to ensure robustness in real-world scenarios.