Large Language Models in Bioinformatics: A Survey
Zhenyu Wang, Zikang Wang, Jiyue Jiang, Pengan Chen, Xiangyu Shi, Yu Li
arXiv:2503.04490 · cs.CL · Published 2025-03-06 · Updated 2026-03-01
Large Language Models (LLMs) are revolutionizing bioinformatics, enabling advanced analysis of DNA, RNA, proteins, and single-cell data. This survey provides a systematic review of recent advancements, focusing on genomic sequence modeling, RNA structure prediction, protein function inference, and single-cell transcriptomics. We also discuss several key challenges, including data scarcity, computational complexity, and cross-omics integration, and explore future directions such as multimodal learning, hybrid AI models, and clinical applications. By offering a comprehensive perspective, this paper underscores the transformative potential of LLMs in driving innovations in bioinformatics and precision medicine.
Topics: Large Language Models & Materials, Protein & Biomolecules
Tags: protein-function, structure-prediction
arXiv categories: cs.CL, q-bio.GN