ProtT-Affinity: Sequence-Based Protein-Protein Binding Affinity Prediction Using ProtT5 Embeddings
Hongfu Lou
arXiv:2511.16113 · q-bio.QM · Published 2025-11-20
Predicting the binding affinity of protein-protein complexes directly from sequence remains a challenging problem, particularly in the absence of reliable structural information. Here I present ProtT-Affinity, a sequence-only model that combines ProtT5 embeddings with a lightweight Transformer architecture. The model is trained and evaluated on homology-filtered subsets of the PDBBind database, following a curation protocol consistent with prior structure-based work. Across two independent test sets, ProtT-Affinity reaches Pearson correlation coefficients of 0.628 and 0.459, respectively. Although its performance does not match the strongest structure-based methods, it is competitive with several widely used approaches and provides a practical alternative when structural data are missing or uncertain. The results suggest that large protein language models capture features relevant to binding energetics, and that these features can be exploited to approximate affinity trends at scale.
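The Pearson correlation coefficient reported above measures how linearly the predicted affinities track the measured ones. A minimal sketch of that metric follows, using only the Python standard library. The function name `pearson_r` and the toy affinity values are illustrative assumptions; they are not taken from the paper or its code.

```python
import math


def pearson_r(predicted, measured):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(predicted) != len(measured):
        raise ValueError("inputs must have the same length")
    n = len(predicted)
    if n < 2:
        raise ValueError("need at least two pairs")
    mean_p = sum(predicted) / n
    mean_m = sum(measured) / n
    # Covariance and variances, left unnormalised because the 1/n factors cancel.
    cov = sum((p - mean_p) * (m - mean_m) for p, m in zip(predicted, measured))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_m = sum((m - mean_m) ** 2 for m in measured)
    if var_p == 0 or var_m == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_p * var_m)


# Hypothetical toy affinities, made up for illustration only.
toy_measured = [6.2, 7.8, 9.1, 5.4, 8.3]
toy_predicted = [6.9, 7.1, 8.4, 6.0, 7.7]
r = pearson_r(toy_predicted, toy_measured)
print(f"Pearson r on toy data: {r:.3f}")
```

Pearson correlation is insensitive to a constant offset or rescaling of the predictions, so it reflects whether a model ranks and spaces affinities correctly rather than whether its absolute values are accurate.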
Topics: Property Prediction & ADMET, Protein & Biomolecules
Tags: protein-ligand, protein-llm
arXiv categories: q-bio.QM