DeepProtein: Deep Learning Library and Benchmark for Protein Sequence Learning

Jiaqing Xie, Tianfan Fu

arXiv:2410.02023 · cs.LG · Published 2024-10-02 · Updated 2025-04-06

Deep learning has deeply influenced protein science, enabling breakthroughs in predicting protein properties, higher-order structures, and molecular interactions. This paper introduces DeepProtein, a comprehensive and user-friendly deep learning library tailored for protein-related tasks. It enables researchers to seamlessly apply cutting-edge deep learning models to protein data. To assess model performance, we establish a benchmark evaluating different deep learning architectures across multiple protein-related tasks, including protein function prediction, subcellular localization prediction, protein-protein interaction prediction, and protein structure prediction. Furthermore, we introduce DeepProt-T5, a series of fine-tuned Prot-T5-based models that achieve state-of-the-art performance on four benchmark tasks and competitive results on six others. Comprehensive documentation and tutorials are available to ensure accessibility and support reproducibility. Built upon the widely used drug discovery library DeepPurpose, DeepProtein is publicly available at https://github.com/jiaqingxie/DeepProtein.

Topics: Protein & Biomolecules

Tags: drug-discovery, protein-function, protein-structure, structure-prediction

arXiv categories: cs.LG, cs.AI, q-bio.QM
