Non-Canonical Crosslinks Confound Evolutionary Protein Structure Models

Romain Lacombe

arXiv:2503.17368 · q-bio.BM · Published 2025-03-09

Evolution-based protein structure prediction models have achieved breakthrough success in recent years. However, they struggle to generalize beyond evolutionary priors and to sequences that lack rich homologous data. Here we present a novel, out-of-domain benchmark based on sactipeptides, a rare class of ribosomally synthesized and post-translationally modified peptides (RiPPs) characterized by sulfur-to-$α$-carbon thioether bridges that cross-link cysteine residues to the backbone. We evaluate recent models on predicting conformations compatible with these cross-link bridges for the 10 known sactipeptides with elucidated post-translational modifications. Crucially, the structures of 5 of them have not yet been experimentally resolved. This makes the task a challenging problem for evolution-based models, which we find exhibit limited performance (0.0% to 19.2% GDT-TS on sulfur-to-$α$-carbon distance). Our results point to the need for physics-informed models to sustain progress in biomolecular structure prediction.
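To make the reported metric more concrete, here is a minimal sketch of how a GDT-TS-style score over sulfur-to-$α$-carbon distances might be computed. The abstract does not spell out the exact definition, so everything here is an assumption: the function names `bridge_distance` and `gdt_ts_like` are hypothetical, the cutoffs are the standard GDT-TS thresholds (1, 2, 4, 8 Å), and the reference distance of 1.8 Å is an approximate C–S single-bond length chosen for illustration, not a value taken from the paper.

```python
import math

# Standard GDT-TS cutoffs in angstroms (assumed to carry over to this metric).
GDT_TS_CUTOFFS = (1.0, 2.0, 4.0, 8.0)


def bridge_distance(sulfur_xyz, calpha_xyz):
    """Euclidean distance between a cysteine sulfur and the acceptor alpha-carbon."""
    return math.dist(sulfur_xyz, calpha_xyz)


def gdt_ts_like(pred_distances, ref_distance=1.8, cutoffs=GDT_TS_CUTOFFS):
    """GDT-TS-style score (0 to 100) over predicted S-to-C-alpha distances.

    For each cutoff, compute the fraction of bridges whose predicted distance
    deviates from ref_distance by at most that cutoff, then average the
    fractions. ref_distance is an illustrative assumption, not the paper's value.
    """
    if not pred_distances:
        raise ValueError("at least one bridge distance is required")
    deviations = [abs(d - ref_distance) for d in pred_distances]
    fractions = [
        sum(dev <= cutoff for dev in deviations) / len(deviations)
        for cutoff in cutoffs
    ]
    return 100.0 * sum(fractions) / len(cutoffs)


# Hypothetical example: three bridges in one predicted structure.
example_distances = [
    bridge_distance((0.0, 0.0, 0.0), (1.8, 0.0, 0.0)),  # compatible with a bond
    4.8,   # a few angstroms too far apart
    20.0,  # residues placed far apart, incompatible with a bridge
]
example_score = gdt_ts_like(example_distances)
```

Under these assumptions, a model that places bridged atoms far apart scores close to 0, while one that brings every sulfur within bonding range of its $α$-carbon scores 100.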

Topics: Protein & Biomolecules

Tags: protein-structure, structure-prediction

arXiv categories: q-bio.BM, cs.AI
