Data-driven complete basis set limit estimates from a minimal auxiliary basis
Nicolas Grimblat, Gabriel Klassen, Guido Falk von Rudorff
arXiv:2605.15927 · physics.chem-ph · Published 2026-05-15
Quantum chemistry calculations are often performed using atom-centered basis sets, which are chosen to balance accuracy and cost. While these basis sets are systematically improvable, the total energy converges slowly with basis set size towards the complete basis set (CBS) limit. Common extrapolation methods require several intermediate-quality calculations to obtain an estimate of the CBS energy. We propose combining a pairwise interaction model with a minimal complementary auxiliary basis set (CABS) baseline to estimate the CBS energy from a single quantum chemistry calculation in a minimal basis set via kernel ridge regression (KRR), an approach that is more efficient than both direct and $\Delta$-machine learning. We show that KRR on standard molecular representations can be improved by approximating atom-wise local kernels using Chebyshev polynomials, which allows us to train KRR models efficiently on moderate compute resources. This further enables a data-driven approach towards the CBS limit that combines physical baselines capturing leading-order effects with data-efficient machine learning models.
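For readers unfamiliar with the two reference strategies the abstract compares against, the following is a minimal sketch of direct learning and $\Delta$-learning with closed-form KRR. It is not the authors' pairwise/CABS model or their Chebyshev kernel approximation. The descriptors `X`, the energies `e_base` and `e_cbs`, the Gaussian kernel, and the hyperparameters `sigma` and `lam` are all hypothetical, synthetic stand-ins chosen only for illustration.

```python
import numpy as np

def sq_dists(A, B):
    # Pairwise squared Euclidean distances between rows of A and rows of B.
    d2 = (A**2).sum(axis=1)[:, None] + (B**2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.maximum(d2, 0.0)  # clip tiny negative values from round-off

def gaussian_kernel(A, B, sigma):
    # Gaussian kernel; an assumed choice, the abstract does not name the kernel.
    return np.exp(-sq_dists(A, B) / (2.0 * sigma**2))

def krr_fit(K, y, lam):
    # Closed-form KRR weights: solve (K + lam * I) alpha = y.
    return np.linalg.solve(K + lam * np.eye(len(y)), y)

# Synthetic stand-ins (assumptions, not data from the paper).
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(200, 3))        # hypothetical molecular descriptors
e_base = np.sin(X).sum(axis=1)                   # cheap baseline, leading-order effect
e_cbs = e_base + 0.1 * np.cos(3.0 * X[:, 0])     # "reference" target energy

train, test = np.arange(150), np.arange(150, 200)
sigma, lam = 0.5, 1e-6                           # hypothetical hyperparameters

K_train = gaussian_kernel(X[train], X[train], sigma)
K_test = gaussian_kernel(X[test], X[train], sigma)

# Direct learning: regress the target energy itself.
alpha_direct = krr_fit(K_train, e_cbs[train], lam)
pred_direct = K_test @ alpha_direct

# Delta-learning: regress only the correction on top of the baseline.
alpha_delta = krr_fit(K_train, (e_cbs - e_base)[train], lam)
pred_delta = e_base[test] + K_test @ alpha_delta

mae_baseline = np.abs(e_base[test] - e_cbs[test]).mean()
mae_direct = np.abs(pred_direct - e_cbs[test]).mean()
mae_delta = np.abs(pred_delta - e_cbs[test]).mean()
print(f"baseline {mae_baseline:.2e}  direct {mae_direct:.2e}  delta {mae_delta:.2e}")
```

In this toy setup the baseline already carries most of the signal, so the learned correction only has to model a small, smooth residual. The abstract's proposal goes further by building the baseline from a pairwise interaction model and a minimal CABS calculation.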
Topics: Quantum Chemistry & Force Fields
Tags: molecular-representation, quantum-chemistry