Multimodal Large Language Models for Inverse Molecular Design with Retrosynthetic Planning

Gang Liu, Michael Sun, Wojciech Matusik, Meng Jiang, Jie Chen

arXiv:2410.04223·cs.LG·Published 2024-10-05

While large language models (LLMs) have integrated images, adapting them to graphs remains challenging, limiting their applications in materials and drug design. This difficulty stems from the need for coherent autoregressive generation across texts and graphs. To address this, we introduce Llamole, the first multimodal LLM capable of interleaved text and graph generation, enabling molecular inverse design with retrosynthetic planning. Llamole integrates a base LLM with the Graph Diffusion Transformer and Graph Neural Networks for multi-conditional molecular generation and reaction inference within texts, while the LLM, with enhanced molecular understanding, flexibly controls activation among the different graph modules. Additionally, Llamole integrates A* search with LLM-based cost functions for efficient retrosynthetic planning. We create benchmarking datasets and conduct extensive experiments to evaluate Llamole against in-context learning and supervised fine-tuning. Llamole significantly outperforms 14 adapted LLMs across 12 metrics for controllable molecular design and retrosynthetic planning.

TopicsGenerative Design & Molecule Optimization, Large Language Models & Materials, Molecular Representation & Learning

Tagsdrug-discovery gnn molecular-generation

arXiv categoriescs.LG, physics.chem-ph, q-bio.BM

arXiv abstract pagePDF