Pre-training Graph Neural Networks with Structural Fingerprints for Materials Discovery

Shuyi Jia, Shitij Govil, Manav Ramprasad, Victor Fung

arXiv:2503.01227 · cond-mat.mtrl-sci · Published 2025-03-03

In recent years, pre-trained graph neural networks (GNNs) have been developed as general models that can be effectively fine-tuned for a variety of downstream tasks in materials science, and they have shown significant improvements in accuracy and data efficiency. The most widely used pre-training methods currently involve either supervised training to fit a general force field or self-supervised training by denoising equilibrium atomic structures. Both methods require datasets generated from quantum mechanical calculations, which quickly become intractable when scaling to larger datasets. Here we propose a novel pre-training objective which instead uses cheaply computed structural fingerprints as targets while maintaining comparable performance across a range of different structural descriptors. Our experiments show this approach can act as a general strategy for pre-training GNNs, with application towards large-scale foundation models for atomistic data.
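The abstract does not say which descriptor or loss the authors use, so the following is only a minimal sketch of the general idea: compute a cheap per-atom structural fingerprint directly from atomic positions, then treat it as a regression target. It assumes a Gaussian-smeared radial histogram as the fingerprint, mean squared error as the loss, and a non-periodic toy structure. The names `radial_fingerprint` and `fingerprint_mse` and all parameter values are hypothetical and are not taken from the paper.

```python
import math


def radial_fingerprint(positions, cutoff=5.0, n_bins=8, width=0.5):
    """Per-atom Gaussian-smeared histogram of neighbor distances.

    Hypothetical stand-in for a structural descriptor; it needs only
    atomic positions, so no quantum mechanical calculation is involved.
    """
    # Evenly spaced bin centers between 0 and the cutoff radius.
    centers = [cutoff * (k + 0.5) / n_bins for k in range(n_bins)]
    fingerprints = []
    for i, p_i in enumerate(positions):
        fp = [0.0] * n_bins
        for j, p_j in enumerate(positions):
            if i == j:
                continue
            d = math.dist(p_i, p_j)
            if d >= cutoff:
                continue  # ignore atoms outside the cutoff sphere
            for k, c in enumerate(centers):
                fp[k] += math.exp(-((d - c) ** 2) / (2 * width ** 2))
        fingerprints.append(fp)
    return fingerprints


def fingerprint_mse(predicted, target):
    """Mean squared error over all atoms and fingerprint components."""
    total, count = 0.0, 0
    for p_row, t_row in zip(predicted, target):
        for p, t in zip(p_row, t_row):
            total += (p - t) ** 2
            count += 1
    return total / count


# Toy structure: three atoms, no periodic boundaries (assumption for brevity).
positions = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (0.0, 2.0, 0.0)]
targets = radial_fingerprint(positions)

# Stand-in for per-atom GNN outputs: an untrained model predicting zeros.
predictions = [[0.0] * len(row) for row in targets]
loss = fingerprint_mse(predictions, targets)
```

In an actual pre-training loop, `predictions` would come from the GNN's per-atom output head and `loss` would be minimized by gradient descent; any descriptor that can be computed from geometry alone could take the place of the radial histogram used here.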

Topics: Generative Design & Molecule Optimization, Large Language Models & Materials, Molecular Representation & Learning

Tags: gnn, materials-science, mlip

arXiv categories: cond-mat.mtrl-sci, cs.LG
