Latent Retrieval Augmented Generation of Cross-Domain Protein Binders

Zishen Zhang, Xiangzhe Kong, Wenbing Huang, Yang Liu

arXiv:2510.10480 · cs.LG · Published 2025-10-12 · Updated 2025-10-16

Designing protein binders that target specific sites, which requires generating realistic and functional interaction patterns, is a fundamental challenge in drug discovery. Current structure-based generative models are limited in generating interfaces with sufficient rationality and interpretability. In this paper, we propose Retrieval-Augmented Diffusion for Aligned interface (RADiAnce), a new framework that leverages known interfaces to guide the design of novel binders. By unifying retrieval and generation in a shared contrastive latent space, our model efficiently identifies relevant interfaces for a given binding site and seamlessly integrates them through a conditional latent diffusion generator, enabling cross-domain interface transfer. Extensive experiments show that RADiAnce significantly outperforms baseline models across multiple metrics, including binding affinity and recovery of geometries and interactions. Additional experimental results validate cross-domain generalization, demonstrating that retrieving interfaces from diverse domains, such as peptides, antibodies, and protein fragments, enhances the generation performance of binders for other domains. Our work establishes a new paradigm for protein binder design that successfully bridges retrieval-based knowledge and generative AI, opening new possibilities for drug discovery.
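
The retrieval step described in the abstract, finding known interfaces relevant to a given binding site in a shared latent space, can be illustrated with a nearest-neighbor lookup. The following is a minimal sketch under stated assumptions, not the authors' implementation: it assumes binding sites and known interfaces have already been embedded as fixed-length vectors, uses cosine similarity as the relevance score, and stands in random vectors for real embeddings. The function name `retrieve_top_k` and the variables `bank` and `query` are hypothetical. The contrastive training of the encoders and the conditional latent diffusion generator are not shown.

```python
import numpy as np


def retrieve_top_k(query, bank, k=3):
    """Return indices and cosine similarities of the k bank vectors closest to query.

    query: 1-D array of shape (d,), a hypothetical binding-site embedding.
    bank:  2-D array of shape (n, d), hypothetical embeddings of known interfaces.
    """
    # Normalize so that the dot product equals cosine similarity.
    q = query / np.linalg.norm(query)
    b = bank / np.linalg.norm(bank, axis=1, keepdims=True)
    sims = b @ q
    # Sort by descending similarity and keep the first k entries.
    order = np.argsort(-sims)[:k]
    return order, sims[order]


# Toy data: random vectors stand in for learned embeddings (assumption).
rng = np.random.default_rng(0)
bank = rng.normal(size=(100, 16))
# A query that is a slightly perturbed copy of bank entry 42.
query = bank[42] + 0.01 * rng.normal(size=16)

idx, scores = retrieve_top_k(query, bank, k=3)
print(idx, scores)
```

In a retrieval-augmented generator, the retrieved vectors would then be passed to the generator as conditioning; how RADiAnce does this is specified in the paper rather than in this sketch.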

Topics: Generative Design & Molecule Optimization, Protein & Biomolecules

Tags: drug-discovery, generative-model, protein-ligand

arXiv categories: cs.LG, cs.AI
