Grassmannian Geometry and Global Convergence of Variable Projection for Neural Networks
Mathias Dus
arXiv:2601.22897 · math.OC · Published 2026-01-30
Training deep neural networks and Physics-Informed Neural Networks (PINNs) often leads to ill-conditioned and stiff optimization problems. A key structural feature of these models is that they are linear in the output-layer parameters and nonlinear in the hidden-layer parameters, yielding a separable nonlinear least-squares formulation. In this work, we study the classical variable projection (VarPro) method for such problems in the context of deep neural networks. We provide a geometric formulation on the Grassmannian and analyze the structure of critical points and convergence properties of the reduced problem. When the feature map is parametrized by a neural network, we show that these properties persist except in rank-deficient regimes, which we address via a regularized Grassmannian framework. Numerical experiments for regression and PINNs, including an efficient solver for the heat equation, illustrate the practical effectiveness of the approach.
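To make the separable structure concrete, the following is a minimal sketch of the variable projection idea on a toy regression problem. It is not the paper's implementation: the one-hidden-layer tanh feature map, the names `features` and `reduced_loss`, and the synthetic data are all assumptions chosen for illustration. For fixed hidden-layer parameters, the output-layer weights are obtained by solving a linear least-squares problem, so the reduced loss is a function of the hidden-layer parameters alone.

```python
import numpy as np

def features(theta, x):
    """Hypothetical one-hidden-layer feature map: tanh(x * w + b).

    theta = (w, b), each of shape (n_features,); x has shape (n_samples,).
    Returns a matrix of shape (n_samples, n_features).
    """
    w, b = theta
    return np.tanh(np.outer(x, w) + b)

def reduced_loss(theta, x, y):
    """VarPro reduced objective (illustrative).

    The output-layer weights c enter linearly, so for fixed theta they are
    eliminated by solving min_c 0.5 * ||Phi(theta) c - y||^2 directly.
    """
    Phi = features(theta, x)
    # lstsq returns the minimum-norm solution, also when Phi is rank deficient.
    c, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    r = Phi @ c - y
    return 0.5 * np.dot(r, r), c

# Synthetic data (an assumption for this sketch, not from the paper).
rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 50)
y = np.sin(np.pi * x)
theta = (rng.normal(size=8), rng.normal(size=8))

loss, c = reduced_loss(theta, x, y)
```

In this sketch the reduced loss depends on the feature matrix only through its column space, since the inner solve amounts to projecting `y` onto that space. An outer optimizer would then update `theta` alone, re-solving for `c` at each step.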
Topics: Scientific Machine Learning & PINNs
Tags: physics-informed-neural-networks, pinns
arXiv categories: math.OC