GoodRegressor: A Hierarchical Inductive Bias for Navigating High-Dimensional Compositional Space

Seong-Hoon Jang

arXiv:2510.18325 · cond-mat.mtrl-sci · Published 2025-10-21 · Updated 2026-03-27

Interpretable scientific machine learning often trades predictive performance for structural transparency. When physical targets arise from hierarchical and nonlinear descriptor entanglement, weakly interacting white-box models underfit, whereas highly expressive black-box models obscure physical insight. Here I introduce GoodRegressor, a hierarchical, depth-controlled symbolic regression framework that systematically assembles nonlinear descriptor interactions through lexicographically ordered expansion. Although the effective compositional search space approaches $\sim 10^{400}$ structures, disciplined depth control enables tractable and reproducible exploration under realistic computational constraints. Across oxygen-ion conductors, NASICONs, and superconducting oxides, chosen as representative high-complexity testbeds, predictive performance matches or exceeds that of state-of-the-art black-box models while retaining an explicit functional form. Moreover, the evolution of performance with interaction depth reveals system-dependent optimal windows, providing an empirical taxonomy of hierarchical complexity in scientific datasets. These results establish hierarchical inductive bias with explicit depth control as a design principle for interpretable artificial intelligence in high-dimensional compositional spaces, and position interaction depth as a structural axis for diagnosing hierarchical complexity in scientific systems.
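To make the idea of a depth-controlled, lexicographically ordered expansion more concrete, the following is a minimal illustrative sketch, not the GoodRegressor implementation. It assumes that an "interaction" is simply a product of descriptors and that "depth" is the number of factors in that product; the function names (`expand_interactions`, `count_terms`, `log10_model_count`) and the descriptor names are hypothetical. It also shows why an unconstrained search becomes astronomically large: if any subset of candidate terms may enter a model, the number of candidate models grows as 2 to the power of the term count. The count produced here is not meant to reproduce the paper's $\sim 10^{400}$ estimate, whose derivation the abstract does not give.

```python
from itertools import combinations_with_replacement
from math import comb, log10


def expand_interactions(descriptors, max_depth):
    """Enumerate descriptor products up to max_depth factors.

    Assumption (not from the paper): an interaction term is a multiset of
    descriptors, read as their product. Terms are emitted depth by depth,
    and in lexicographic order within each depth.
    """
    ordered = sorted(descriptors)
    terms = []
    for depth in range(1, max_depth + 1):
        # combinations_with_replacement preserves the order of its input,
        # so a sorted input gives lexicographically ordered tuples.
        terms.extend(combinations_with_replacement(ordered, depth))
    return terms


def count_terms(n_descriptors, max_depth):
    """Closed-form count of the terms produced by expand_interactions."""
    return sum(
        comb(n_descriptors + d - 1, d) for d in range(1, max_depth + 1)
    )


def log10_model_count(n_descriptors, max_depth):
    """log10 of the number of term subsets (2 ** count_terms)."""
    return count_terms(n_descriptors, max_depth) * log10(2)


if __name__ == "__main__":
    # Hypothetical descriptor names, for illustration only.
    for term in expand_interactions(["radius", "charge"], max_depth=2):
        print(" * ".join(term))
    print(count_terms(20, 3), "terms for 20 descriptors at depth <= 3")
    print(f"about 10^{log10_model_count(20, 3):.0f} candidate term subsets")
```

Capping `max_depth` is the only control in this sketch. It bounds the number of terms polynomially in the number of descriptors, which is one simple way to read "disciplined depth control", under the assumptions stated above.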

Topics: Scientific Machine Learning & PINNs

Tags: inductive-bias, scientific-machine-learning, symbolic-regression

arXiv categories: cond-mat.mtrl-sci, physics.comp-ph
