ProtT-Affinity: Sequence-Based Protein-Protein Binding Affinity Prediction Using ProtT5 Embeddings

Hongfu Lou

arXiv:2511.16113 · q-bio.QM · Published 2025-11-20

Predicting the binding affinity of protein-protein complexes directly from sequence remains a challenging problem, particularly in the absence of reliable structural information. Here I present ProtT-Affinity, a sequence-only model that combines ProtT5 embeddings with a lightweight Transformer architecture. The model is trained and evaluated on homology-filtered subsets of the PDBBind database, following a curation protocol consistent with prior structure-based work. Across two independent test sets, ProtT-Affinity reaches Pearson correlation coefficients of 0.628 and 0.459, respectively. Although its performance does not match the strongest structure-based methods, it is competitive with several widely used approaches and provides a practical alternative when structural data are missing or uncertain. The results suggest that large protein language models capture features relevant to binding energetics, and that these features can be exploited to approximate affinity trends at scale.
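The abstract reports performance as Pearson correlation coefficients between predicted and experimental affinities. As a minimal sketch of how such a score is computed, the snippet below implements the standard Pearson formula in plain Python. The function name `pearson_r` and the toy affinity values are illustrative assumptions; they are not taken from the paper, and this is not the author's evaluation code.

```python
import math


def pearson_r(predicted, measured):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(predicted) != len(measured):
        raise ValueError("inputs must have the same length")
    n = len(predicted)
    if n < 2:
        raise ValueError("need at least two pairs of values")

    mean_p = sum(predicted) / n
    mean_m = sum(measured) / n

    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((p - mean_p) * (m - mean_m) for p, m in zip(predicted, measured))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_m = sum((m - mean_m) ** 2 for m in measured)

    if var_p == 0 or var_m == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_p * var_m)


# Hypothetical affinity values in arbitrary units (NOT data from the paper).
predicted = [6.1, 7.4, 8.0, 5.2, 9.3]
measured = [5.8, 7.9, 7.1, 6.0, 9.0]
r = pearson_r(predicted, measured)
print(f"Pearson r = {r:.3f}")
```

A value near 1 means the predictions track the experimental ranking and scale closely, while a value near 0 means little linear relationship. Pearson correlation measures agreement in trend rather than absolute error, which matches the abstract's framing of approximating affinity trends.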

Topics: Property Prediction & ADMET, Protein & Biomolecules

Tags: protein-ligand, protein-llm

arXiv categories: q-bio.QM
