MP-GCAN: a highly accurate classifier for $α$-helical membrane proteins and $β$-barrel proteins

Kunyang Li, Hongfu Lou, Dinan Peng

arXiv:2507.14269 · q-bio.QM · Published 2025-07-18

Membrane protein classification is a fundamental task in structural bioinformatics, critical to understanding protein functions and accelerating drug discovery. In this study, we propose MP-GCAN, a novel graph-based classification model that leverages both spatial and sequential features of proteins. MP-GCAN combines GCN, GAT, and GIN layers to capture hierarchical structural representations from 3D protein graphs, constructed from high-resolution PDB files with $α$-carbon coordinates and residue types. To evaluate performance, we curated a high-quality dataset of 500 membrane and 500 non-membrane proteins, and compared MP-GCAN with two baselines: a structure-confidence-based SGD classifier utilizing AlphaFold's pLDDT scores, and DeepTMHMM, a sequence-based deep learning model. Our experiments demonstrate that MP-GCAN significantly outperforms baselines, achieving an accuracy of 96% and strong F1-scores on both classes. The results highlight the importance of integrating pretrained GNN architectures with domain-specific structural data to enhance membrane protein classification.
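
The graph construction step mentioned in the abstract (3D protein graphs built from $α$-carbon coordinates and residue types) could look roughly like the sketch below. This is an illustration under stated assumptions, not the authors' pipeline: the function names `parse_ca_atoms` and `build_contact_graph` are hypothetical, and the 8 Å distance cutoff for drawing edges is a common convention for residue contact graphs that the abstract does not specify. Only the fixed-column layout of PDB `ATOM` records is taken from the PDB format itself.

```python
import math


def parse_ca_atoms(pdb_text):
    """Return a list of (residue_name, (x, y, z)) for each alpha-carbon.

    Relies on the fixed-column layout of PDB ATOM records:
    atom name in columns 13-16, residue name in 18-20,
    x/y/z coordinates in columns 31-38, 39-46, 47-54.
    HETATM records (e.g. calcium ions, also named "CA") are skipped.
    """
    residues = []
    for line in pdb_text.splitlines():
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            coords = (
                float(line[30:38]),
                float(line[38:46]),
                float(line[46:54]),
            )
            residues.append((line[17:20].strip(), coords))
    return residues


def build_contact_graph(residues, cutoff=8.0):
    """Return undirected edges (i, j), i < j, between residues whose
    alpha-carbons lie within `cutoff` angstroms of each other.

    The 8.0 default is an assumption, not a value from the paper.
    """
    edges = []
    for i in range(len(residues)):
        for j in range(i + 1, len(residues)):
            if math.dist(residues[i][1], residues[j][1]) <= cutoff:
                edges.append((i, j))
    return edges
```

In such a setup the residue names would serve as node features (for example, one-hot encoded over the 20 amino acid types) and the edge list would be passed to the GNN layers. The pairwise loop is quadratic in the number of residues, which is acceptable for single chains but would need a spatial index for very large structures.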

Topics: Molecular Representation & Learning, Protein & Biomolecules

Tags: drug-discovery, protein-structure

