Recovering Hidden Degrees of Freedom Using Gaussian Processes
Georg Diez, Nele Dethloff, Gerhard Stock
arXiv:2505.18072 · cond-mat.soft · Published 2025-05-23 · Updated 2025-07-31
Dimensionality reduction represents a crucial step in extracting meaningful insights from Molecular Dynamics (MD) simulations. Conventional approaches, including linear methods such as principal component analysis as well as various autoencoder architectures, typically operate under the assumption of independent and identically distributed data, disregarding the sequential nature of MD simulations. Here, we introduce a physics-informed representation learning framework that leverages Gaussian Processes combined with variational autoencoders to exploit the temporal dependencies inherent in MD data. Time-dependent kernel functions, such as the Matérn kernel, directly impose the temporal correlation structure of the input coordinates onto a low-dimensional space, preserving Markovianity in the reduced representation while faithfully capturing the essential dynamics. Using a three-dimensional toy model, we demonstrate that this approach can successfully identify and separate dynamically distinct states that are geometrically indistinguishable due to hidden degrees of freedom. Applying the framework to a $50\,\mu$s-long MD trajectory of T4 lysozyme, we uncover dynamically distinct conformational substates that previous analyses failed to resolve, revealing functional relationships that become apparent only when temporal correlations are taken into account. This time-aware perspective provides a promising framework for understanding complex biomolecular systems, in which conventional collective variables fail to capture the full dynamical picture.
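To illustrate how a time-dependent kernel encodes temporal correlation, the following is a minimal sketch of the standard closed-form Matérn kernel evaluated on frame times. It is not the authors' implementation: the function names `matern_kernel` and `kernel_matrix`, the smoothness value `nu=1.5`, the length scale, and the example time stamps are all assumptions chosen for illustration, and the abstract does not state which settings the paper uses.

```python
import math


def matern_kernel(t1, t2, length_scale=1.0, variance=1.0, nu=1.5):
    """Closed-form Matern covariance between two time points.

    Only the half-integer smoothness values 0.5, 1.5 and 2.5 are handled;
    these are the standard textbook forms, not values taken from the paper.
    """
    r = abs(t1 - t2) / length_scale  # time lag in units of the length scale
    if nu == 0.5:
        shape = math.exp(-r)  # exponential (Ornstein-Uhlenbeck) kernel
    elif nu == 1.5:
        s = math.sqrt(3.0) * r
        shape = (1.0 + s) * math.exp(-s)
    elif nu == 2.5:
        s = math.sqrt(5.0) * r
        shape = (1.0 + s + s * s / 3.0) * math.exp(-s)
    else:
        raise ValueError("only nu in {0.5, 1.5, 2.5} is supported in this sketch")
    return variance * shape


def kernel_matrix(times, **kernel_args):
    """Covariance matrix over a list of (possibly unevenly spaced) frame times."""
    return [[matern_kernel(a, b, **kernel_args) for b in times] for a in times]


# Hypothetical frame times in arbitrary units; the correlation decays with the lag.
times = [0.0, 1.0, 2.0, 5.0]
K = kernel_matrix(times, length_scale=2.0, nu=1.5)
```

In a Gaussian-process prior over latent coordinates, a matrix like `K` would make frames that are close in time strongly correlated in the reduced space, while frames separated by many length scales become nearly independent.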
Topics: Process Modeling & System Identification, Uncertainty Quantification & Bayesian Methods
Tags: gaussian-process, molecular-dynamics
arXiv categories: cond-mat.soft, physics.bio-ph, physics.comp-ph, physics.data-an