Stochastic Dimension-Free Zeroth-Order Estimator for High-Dimensional and High-Order PINNs

Zhangyong Liang, Ji Zhang

arXiv:2603.24002v1 Announce Type: new Abstract: Physics-Informed Neural Networks (PINNs) for high-dimensional and high-order partial differential equations (PDEs) are primarily constrained by the $\mathcal{O}(d^k)$ spatial derivative complexity and the $\mathcal{O}(P)$ memory overhead of backpropagation (BP). While randomized spatial estimators successfully reduce the spatial complexity to $\mathcal{O}(1)$, their reliance on first-order optimization still leads to prohibitive memory consumption at scale. Zeroth-order (ZO) optimization offers a BP-free alternative; however, naively combining randomized spatial operators with ZO perturbations triggers a variance explosion of $\mathcal{O}(1/\varepsilon^2)$, leading to numerical divergence. To address these challenges, we propose the \textbf{S}tochastic \textbf{D}imension-free \textbf{Z}eroth-order \textbf{E}stimator (\textbf{SDZE}), a unified framework that achieves dimension-independent complexity in both space and memory. Specifically, SDZE leverages \emph{Common Random Numbers Synchronization (CRNS)} to algebraically cancel the $\mathcal{O}(1/\varepsilon^2)$ variance by locking spatial random seeds across perturbations. Furthermore, an \emph{implicit matrix-free subspace projection} is introduced to reduce parameter exploration variance from $\mathcal{O}(P)$ to $\mathcal{O}(r)$ while maintaining an $\mathcal{O}(1)$ optimizer memory footprint. Empirical results demonstrate that SDZE enables the training of 10-million-dimensional PINNs on a single NVIDIA A100 GPU, delivering significant improvements in speed and memory efficiency over state-of-the-art baselines.
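The core CRNS idea, reusing the same spatial random draw for both perturbed loss evaluations so that the noise cancels in the two-point difference instead of being amplified by the $1/(2\varepsilon)$ factor, can be illustrated with a toy sketch. The surrogate loss, function names, and constants below are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def stochastic_loss(theta, rng):
    # Toy stand-in for a randomized spatial estimator: each call draws a
    # fresh random spatial direction, as a randomized PDE residual would.
    x = rng.standard_normal(theta.shape)
    return float(np.sum((theta * x) ** 2))

def zo_quotients(theta, eps, sync, n=200, master_seed=0):
    # Per-sample two-point zeroth-order difference quotients.  With
    # sync=True the spatial random seed is shared by the +eps and -eps
    # evaluations (common random numbers), so the O(1) spatial noise
    # cancels algebraically; with sync=False it is divided by 2*eps and
    # the quotient variance blows up like O(1/eps^2).
    master = np.random.default_rng(master_seed)
    out = []
    for _ in range(n):
        u = master.standard_normal(theta.size)    # ZO perturbation direction
        seed = int(master.integers(1 << 31))
        rng_plus = np.random.default_rng(seed)
        rng_minus = np.random.default_rng(seed if sync else seed + 1)
        diff = (stochastic_loss(theta + eps * u, rng_plus)
                - stochastic_loss(theta - eps * u, rng_minus))
        out.append(diff / (2 * eps))
    return np.asarray(out)

theta = np.ones(16)
std_crns = zo_quotients(theta, eps=1e-3, sync=True).std()
std_naive = zo_quotients(theta, eps=1e-3, sync=False).std()
print(std_crns, std_naive)  # synced quotients are orders of magnitude tamer
```

Shrinking `eps` makes the unsynchronized estimator arbitrarily worse while leaving the seed-locked one stable, which is the qualitative behavior the abstract attributes to CRNS.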

Executive Summary

The paper proposes the Stochastic Dimension-Free Zeroth-Order Estimator (SDZE), a framework for training Physics-Informed Neural Networks (PINNs) on high-dimensional and high-order partial differential equations. SDZE achieves dimension-independent complexity in both space and memory by combining Common Random Numbers Synchronization (CRNS), which cancels the variance explosion of naive zeroth-order perturbation, with an implicit matrix-free subspace projection that bounds parameter-exploration variance. The authors report training 10-million-dimensional PINNs on a single NVIDIA A100 GPU with significant speed and memory improvements over state-of-the-art baselines.

Key Points

  • SDZE removes the $\mathcal{O}(d^k)$ spatial derivative complexity and the $\mathcal{O}(P)$ backpropagation memory overhead that constrain PINNs on high-dimensional, high-order PDEs.
  • CRNS locks spatial random seeds across zeroth-order perturbations to cancel the $\mathcal{O}(1/\varepsilon^2)$ variance, while an implicit matrix-free subspace projection reduces parameter exploration variance from $\mathcal{O}(P)$ to $\mathcal{O}(r)$ at an $\mathcal{O}(1)$ optimizer memory footprint.
  • SDZE enables training 10-million-dimensional PINNs on a single GPU with significant speed and memory gains over state-of-the-art baselines.
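The matrix-free subspace projection can be sketched by regenerating an implicit Gaussian basis from a fixed seed instead of storing it, so that extra memory stays small even when the parameter count is large. Everything below (function names, the quadratic surrogate loss, the rank, and the step sizes) is a hypothetical sketch to convey the idea, not the authors' code:

```python
import numpy as np

def lift(z, p, proj_seed, chunk=4096):
    # Matrix-free application of an implicit p x r Gaussian projection:
    # rows of the basis are regenerated chunk by chunk from a fixed seed
    # rather than stored, keeping extra memory at O(chunk * r).
    rng = np.random.default_rng(proj_seed)
    r = z.size
    u = np.empty(p)
    for start in range(0, p, chunk):
        stop = min(start + chunk, p)
        block = rng.standard_normal((stop - start, r))
        u[start:stop] = block @ z
    return u / np.sqrt(r)

def zo_subspace_step(theta, loss, eps, lr, r, proj_seed, step_rng):
    # One BP-free update: the perturbation lives in an r-dimensional
    # subspace, so exploration variance scales with r rather than with
    # the full parameter count, and no optimizer state is stored.
    z = step_rng.standard_normal(r)        # r-dim search direction
    u = lift(z, theta.size, proj_seed)     # implicit lift to p dims
    ghat = (loss(theta + eps * u) - loss(theta - eps * u)) / (2 * eps)
    return theta - lr * ghat * u

# Tiny demo on a quadratic surrogate loss.
loss = lambda t: float(np.sum(t * t))
theta = np.ones(64)
init_loss = loss(theta)
step_rng = np.random.default_rng(1)
for _ in range(300):
    theta = zo_subspace_step(theta, loss, eps=1e-3, lr=0.003,
                             r=8, proj_seed=42, step_rng=step_rng)
final_loss = loss(theta)
print(init_loss, final_loss)
```

Note that with a single fixed `proj_seed` the perturbations never leave one r-dimensional span, so parameter components outside it are never updated; the paper presumably handles subspace coverage differently, and this sketch only illustrates the memory/variance trade-off.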

Merits

Strength in addressing complexity

SDZE effectively tackles the spatial derivative complexity and memory overhead of backpropagation, enabling the training of high-dimensional PINNs on a single GPU.

Scalability and efficiency

The framework significantly improves speed and memory efficiency, making it a promising solution for complex PDE problems.

Demerits

Variance explosion mitigation

The abstract asserts that CRNS algebraically cancels the $\mathcal{O}(1/\varepsilon^2)$ variance, but gives no visible detail on the assumptions under which this cancellation holds or how robust it is in practice, which may concern prospective users.

Generalizability

The empirical results are reported only for a single NVIDIA A100 GPU, so it is unclear how well SDZE transfers to other hardware configurations or problem classes.

Expert Commentary

The paper presents a novel framework for training high-dimensional PINNs. SDZE addresses the core bottlenecks of spatial derivative complexity and backpropagation memory overhead, making it feasible to train on complex PDE problems with a single GPU. Further investigation is still needed into the conditions under which the variance cancellation holds and into the generalizability of SDZE across hardware and problem classes. Overall, the findings have significant practical implications and mark a promising step forward for deep learning methods on complex PDE problems.

Recommendations

  • Future work should investigate the conditions under which the CRNS variance cancellation holds and the generalizability of SDZE across hardware and PDE classes.
  • The approach could also be explored in other memory-constrained deep learning settings, such as computer vision and natural language processing.

Sources

Original: arXiv - cs.LG