Towards a Universal Foundation Model for Protein Dynamics: A Multi-Chain Tree-Structured Framework with Transformer Propagators

Jinzhen Zhu

arXiv:2502.05909 · physics.atom-ph · Published 2025-02-09 · Updated 2026-04-15

Simulating large-scale protein dynamics with traditional all-atom molecular dynamics (MD) remains computationally prohibitive. We present a unified, universal framework for coarse-grained molecular dynamics (CG-MD) that achieves high-fidelity structural reconstruction and generalizes across diverse protein systems. Central to our approach is a hierarchical, tree-structured coarse-grained (TSCG) protein representation that maps Cartesian coordinates into a minimal set of interpretable collective variables. We extend this representation to multi-chain assemblies and demonstrate sub-angstrom precision in reconstructing full-atom structures from coarse-grained nodes. To model temporal evolution, we formulate protein dynamics as stochastic differential equations (SDEs) and use a Transformer-based architecture as a universal propagator. Because collective variables are represented as language-like sequences, the model is not tied to a protein-specific network and generalizes to arbitrary sequence lengths and multi-chain configurations. The framework achieves a speedup of 10,000 to 20,000 times over traditional MD, generating microsecond-long trajectories within minutes. The generated trajectories remain statistically consistent with all-atom MD in their RMSD profiles and structural ensembles. This universal model provides a scalable solution for high-throughput protein simulation and is a significant step toward a foundation model for molecular dynamics.
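As a rough illustration of the SDE formulation mentioned in the abstract, the sketch below integrates a generic SDE over a vector of collective variables with the Euler-Maruyama scheme. It is not the paper's implementation: `drift` and `diffusion` are hypothetical placeholder functions (a simple Ornstein-Uhlenbeck process) standing in for whatever the Transformer propagator would predict, and the function names, step size, and variable count are all assumptions made for the example.

```python
import math
import random


def drift(x):
    """Placeholder drift term: pulls each collective variable toward 0.

    Assumption: in the framework described above, a learned model would
    supply this term; here it is a toy Ornstein-Uhlenbeck drift.
    """
    return [-xi for xi in x]


def diffusion(x):
    """Placeholder diffusion term: constant noise amplitude per variable."""
    return [0.1 for _ in x]


def euler_maruyama(x0, drift_fn, diffusion_fn, dt, n_steps, seed=0):
    """Integrate dx = drift(x) dt + diffusion(x) dW with Euler-Maruyama.

    Returns the trajectory as a list of n_steps + 1 states.
    """
    rng = random.Random(seed)
    sqrt_dt = math.sqrt(dt)
    x = list(x0)
    trajectory = [list(x)]
    for _ in range(n_steps):
        mu = drift_fn(x)
        sigma = diffusion_fn(x)
        # One step: deterministic drift plus Gaussian noise scaled by sqrt(dt).
        x = [
            xi + m * dt + s * sqrt_dt * rng.gauss(0.0, 1.0)
            for xi, m, s in zip(x, mu, sigma)
        ]
        trajectory.append(list(x))
    return trajectory


# Three hypothetical collective variables propagated for 1000 steps.
trajectory = euler_maruyama([1.0, -0.5, 0.25], drift, diffusion, dt=0.01, n_steps=1000)
```

Swapping `drift` and `diffusion` for calls into a trained network would leave the integration loop unchanged, which is one reason a propagator of this form can be reused across systems.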

Topics: Protein & Biomolecules, Quantum Chemistry & Force Fields

Tags: molecular-dynamics

arXiv categories: physics.atom-ph, physics.chem-ph, physics.comp-ph
