Enhancing composition-based materials property prediction by cross-modal knowledge transfer

Ivan Rubtsov, Ivan Dudakov, Yuri Kuratov, Vadim Korolev

arXiv:2511.03371 · cond-mat.mtrl-sci · Published 2025-11-05

Crystal graph neural networks are widely used to model experimentally synthesized compounds and hypothetical materials with unknown synthesizability. In contrast, structure-agnostic predictive algorithms allow the exploration of previously inaccessible domains of chemical space. Here we present a universal approach for enhancing composition-based materials property prediction by means of cross-modal knowledge transfer. Two formulations are proposed: implicit transfer involves pretraining chemical language models on multimodal embeddings, whereas explicit transfer involves generating crystal structures and applying structure-aware predictors. The proposed approaches were benchmarked on LLM4Mat-Bench and MatBench tasks, achieving state-of-the-art performance in 25 out of 32 cases. In addition, we demonstrate how interpretability, another modeling aspect of chemical language models, benefits from a game-theoretic approach that can incorporate high-order feature interactions.
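The abstract does not name the specific game-theoretic method, so the following is only an illustrative sketch of one standard choice: exact Shapley values together with the pairwise Shapley interaction index, which captures a second-order feature interaction. The elements, the toy `value` function, and all numbers are hypothetical and are not taken from the paper; a real study would replace `value` with a trained model evaluated on masked compositions.

```python
from itertools import combinations
from math import factorial

def shapley_values(players, value):
    """Exact Shapley value of each player for a set function `value`."""
    n = len(players)
    phi = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for k in range(n):  # k = size of the coalition that p joins
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                total += weight * (value(s | {p}) - value(s))
        phi[p] = total
    return phi

def pairwise_interaction(players, value, i, j):
    """Shapley interaction index for the pair (i, j)."""
    n = len(players)
    others = [q for q in players if q not in (i, j)]
    total = 0.0
    for k in range(n - 1):  # coalitions drawn from the other n - 2 players
        weight = factorial(k) * factorial(n - k - 2) / factorial(n - 1)
        for subset in combinations(others, k):
            s = frozenset(subset)
            # Discrete second derivative: joint effect minus individual effects.
            delta = value(s | {i, j}) - value(s | {i}) - value(s | {j}) + value(s)
            total += weight * delta
    return total

# Hypothetical toy "model": additive element terms plus one Fe-O pair term.
ELEMENTS = ["Li", "Fe", "O"]
SINGLE = {"Li": 1.0, "Fe": 2.0, "O": 0.5}  # assumed values, illustration only

def value(coalition):
    out = sum(SINGLE[e] for e in coalition)
    if "Fe" in coalition and "O" in coalition:
        out += 1.5  # assumed pairwise synergy
    return out

phi = shapley_values(ELEMENTS, value)
fe_o = pairwise_interaction(ELEMENTS, value, "Fe", "O")
li_fe = pairwise_interaction(ELEMENTS, value, "Li", "Fe")
print(phi, fe_o, li_fe)
```

In this toy setup the first-order attributions split the Fe-O synergy evenly between the two elements, while the interaction index isolates it as a separate pairwise term. Exact enumeration scales exponentially with the number of features, so practical use on longer inputs would need sampling or another approximation.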

Topics: Molecular Representation & Learning

Tags: chemical-llm, chemical-space, gnn, property-prediction

arXiv categories: cond-mat.mtrl-sci, physics.comp-ph
