Large Language Models in Bioinformatics: A Survey

Zhenyu Wang, Zikang Wang, Jiyue Jiang, Pengan Chen, Xiangyu Shi, Yu Li

arXiv:2503.04490·cs.CL·Published 2025-03-06·Updated 2026-03-01

Large Language Models (LLMs) are revolutionizing bioinformatics, enabling advanced analysis of DNA, RNA, proteins, and single-cell data. This survey systematically reviews recent advances, focusing on genomic sequence modeling, RNA structure prediction, protein function inference, and single-cell transcriptomics. We also discuss key challenges, including data scarcity, computational complexity, and cross-omics integration, and explore future directions such as multimodal learning, hybrid AI models, and clinical applications. By offering a comprehensive perspective, this paper underscores the transformative potential of LLMs in driving innovation in bioinformatics and precision medicine.

Topics: Large Language Models & Materials, Protein & Biomolecules

Tags: protein-function, structure-prediction

arXiv categories: cs.CL, q-bio.GN
