Large Language Model Assisted Discovery of Optimal Dopants for Enhanced Thermoelectric Performance in CoSb$_3$ Based Skutterudites
Yagnik Bandyopadhyay, Dylan Noel Serrao, Houlong L. Zhuang
arXiv:2604.06048 · cond-mat.mtrl-sci · Published 2026-04-07
We present a data-driven approach for accelerating the discovery of high-performance CoSb$_3$-based skutterudites. We curate a comprehensive dataset of compositions with various filler elements from over 300 research articles. Leveraging large language models (LLMs), we extract and embed compositional representations, which are then used to train a regression head that predicts the thermoelectric figure of merit. Compared to traditional deep neural networks relying on elemental descriptors such as atomic radii, our LLM-based model achieves significantly lower mean-squared error. We further employ the trained model to propose novel filler compositions with promising thermoelectric properties. Finally, we support these predicted candidates with density functional theory and molecular dynamics calculations that assess their electrical and thermal conductivity. The approach demonstrates the potential of combining natural language processing, machine learning, and quantum simulations for thermoelectric materials design.
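The embed-then-regress pipeline described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: `embed` is a hashed character-trigram stand-in for an LLM embedding (the abstract does not name the model), the regression head is a plain linear layer trained by gradient descent on mean-squared error, and the target values are arbitrary placeholders, not measured figures of merit.

```python
import hashlib


def embed(composition: str, dim: int = 16) -> list[float]:
    """Hypothetical stand-in for an LLM embedding of a composition string.

    Hashes character trigrams into a fixed-size, L2-normalized vector.
    This is NOT the paper's embedding; it only gives the head an input.
    """
    vec = [0.0] * dim
    padded = f"  {composition}  "
    for i in range(len(padded) - 2):
        h = int(hashlib.md5(padded[i:i + 3].encode()).hexdigest(), 16)
        sign = 1.0 if (h >> 8) % 2 == 0 else -1.0
        vec[h % dim] += sign
    norm = sum(v * v for v in vec) ** 0.5 or 1.0
    return [v / norm for v in vec]


def predict(w: list[float], b: float, x: list[float]) -> float:
    """Linear regression head: dot product plus bias."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def mse(w: list[float], b: float, X: list[list[float]], y: list[float]) -> float:
    """Mean-squared error of the head over a dataset."""
    return sum((predict(w, b, x) - t) ** 2 for x, t in zip(X, y)) / len(y)


def train_head(X, y, lr: float = 0.1, epochs: int = 500):
    """Fit the linear head with full-batch gradient descent on MSE."""
    dim, n = len(X[0]), len(y)
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, t in zip(X, y):
            err = predict(w, b, x) - t
            for j in range(dim):
                grad_w[j] += 2.0 * err * x[j] / n
            grad_b += 2.0 * err / n
        w = [wj - lr * g for wj, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


# Illustrative filled-skutterudite formula strings; the targets are
# arbitrary placeholder numbers, NOT values from the paper or any experiment.
compositions = ["Yb0.2Co4Sb12", "Ba0.3Co4Sb12", "In0.2Co4Sb12", "La0.1Co4Sb12"]
targets = [0.1, 0.2, 0.3, 0.4]

X = [embed(c) for c in compositions]
initial_mse = mse([0.0] * len(X[0]), 0.0, X, targets)
w, b = train_head(X, targets)
final_mse = mse(w, b, X, targets)
print(f"MSE before training: {initial_mse:.4f}, after: {final_mse:.4f}")
```

In a setup closer to the one described, `embed` would be replaced by a call to an actual LLM embedding model and the linear head by whatever regressor the authors trained; the descriptor-based baseline would swap the embedding for a vector of elemental properties such as atomic radii.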
Topics: Large Language Models & Materials, Quantum Chemistry & Force Fields
Tags: dft, molecular-dynamics, thermal-properties
arXiv categories: cond-mat.mtrl-sci