NEPMaker: Active learning of neuroevolution machine learning potential for large cells

Junjie Wang, Shuning Pan, Haoting Zhang, Qiuhan Jia, Chi Ding, Zheyong Fan, Jian Sun

arXiv:2604.13848 · physics.comp-ph · Published 2026-04-15

Machine learning potentials (MLPs) achieve near first-principles accuracy but often fail for atomic environments outside the training distribution. Active learning can mitigate this limitation; however, its application to large-scale simulations is hindered by the prohibitive cost of labeling entire configurations. Here, we develop a D-optimality-driven active learning framework for the neuroevolution potential (NEP) implemented within the GPUMD package, named NEPMaker. Extrapolative atomic environments are identified on-the-fly and embedded into locally periodic structures, where boundary atoms are optimized to remain close to the training distribution. This strategy enables large-scale simulations to directly contribute to dataset construction, significantly reducing extrapolation errors while improving model robustness and transferability. The proposed framework provides a scalable route for constructing reliable machine learning potentials in complex materials systems, including those involving defects, interfaces, and phase transitions.
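To make the D-optimality criterion concrete, the following is a minimal sketch of an extrapolation grade of the kind used in MaxVol-style active learning. It is an illustration under stated assumptions, not the NEPMaker implementation: the function names, the choice of per-atom feature vector, and the threshold value are all hypothetical, and the abstract does not specify which features NEPMaker uses. The sketch assumes a square "active set" matrix whose rows are feature vectors of already selected environments; a new environment is expressed as a linear combination of those rows, and the largest coefficient magnitude serves as the grade. Values at or below 1 indicate interpolation, larger values indicate extrapolation.

```python
import numpy as np


def extrapolation_grade(active_set: np.ndarray, descriptor: np.ndarray) -> float:
    """Return gamma = max_j |c_j|, where descriptor = c @ active_set.

    active_set: (m, m) matrix, one selected environment per row (assumed invertible).
    descriptor: (m,) feature vector of the environment being tested.
    """
    # c @ A = b is equivalent to A.T @ c = b, so solve the transposed system.
    coeffs = np.linalg.solve(active_set.T, descriptor)
    return float(np.max(np.abs(coeffs)))


def flag_extrapolative(active_set: np.ndarray,
                       descriptors: np.ndarray,
                       threshold: float = 1.0) -> list[int]:
    """Return indices of atoms whose grade exceeds a (hypothetical) threshold."""
    return [i for i, d in enumerate(descriptors)
            if extrapolation_grade(active_set, d) > threshold]


# Toy example: an identity active set, so the grade is simply max |component|.
active = np.eye(3)
atoms = np.array([
    [0.5, -0.2, 0.1],   # inside the span of the active set: grade 0.5
    [2.0, 0.0, 0.0],    # outside: grade 2.0
])
flagged = flag_extrapolative(active, atoms)
```

In a workflow like the one described above, the flagged atoms would be the candidates to cut out and embed into small periodic cells for labeling; how those cells are built and how boundary atoms are optimized is not covered by this sketch.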

Topics: Quantum Chemistry & Force Fields

Tags: ab-initio, active-learning, mlip, phase-transition

arXiv categories: physics.comp-ph
