Learning data efficient coarse-grained molecular dynamics from forces and noise
Aleksander E. P. Durumeric, Yaoyi Chen, Frank Noé, Cecilia Clementi
arXiv:2407.01286·physics.bio-ph·Published 2024-07-01
Machine-learned coarse-grained (MLCG) molecular dynamics is a promising option for modeling biomolecules. However, MLCG models currently require large amounts of data from reference atomistic molecular dynamics or substantial computation for training. Denoising score matching -- the technology behind the widely popular diffusion models -- has simultaneously emerged as a machine-learning framework for creating samples from noise. MLCG models are often trained using atomistic forces, while denoising score matching models extract the data distribution by reversing noise-based corruption. We unify these approaches to improve the training of MLCG force fields, reducing data requirements by a factor of 100 while maintaining the advantages typical of force-based parameterization. The methods are demonstrated on the proteins Trp-Cage and NTL9 and published as open-source code.
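The link between forces and noise that the abstract relies on can be illustrated with a toy example. For a Boltzmann distribution p(x) ∝ exp(-U(x)/kT), the score ∇ log p(x) equals the force divided by kT, so a denoising score matching target, (x - x̃)/σ² for Gaussian noise of scale σ, can be rescaled by kT into a force-like regression target. The sketch below is not the authors' implementation: it assumes a 1D harmonic potential, a linear force model, and hypothetical helper names (`denoising_force_targets`, `fit_linear_force`). It checks only that regressing on noise-derived targets recovers the force of the noise-smoothed distribution.

```python
import numpy as np


def denoising_force_targets(x_clean, x_noisy, sigma, kT):
    """Force-like targets from noise alone (hypothetical helper).

    For Gaussian corruption, the denoising score target is
    (x_clean - x_noisy) / sigma**2; multiplying by kT gives force units.
    """
    return kT * (x_clean - x_noisy) / sigma**2


def fit_linear_force(x_noisy, targets):
    """Least-squares slope of a linear force model F(x) = slope * x."""
    return float(np.dot(x_noisy, targets) / np.dot(x_noisy, x_noisy))


# Toy setup (all values are assumptions chosen for illustration).
rng = np.random.default_rng(0)
kT = 1.0        # thermal energy
k_spring = 1.0  # harmonic potential U(x) = 0.5 * k_spring * x**2
sigma = 0.5     # noise scale used for corruption
n = 200_000     # number of samples

# Boltzmann samples of the harmonic potential are Gaussian with variance kT / k.
x_clean = rng.normal(0.0, np.sqrt(kT / k_spring), size=n)
x_noisy = x_clean + sigma * rng.normal(size=n)

targets = denoising_force_targets(x_clean, x_noisy, sigma, kT)
slope = fit_linear_force(x_noisy, targets)

# The noised distribution is Gaussian with variance kT / k + sigma**2,
# so its exact force is -kT * x / (kT / k + sigma**2).
expected_slope = -kT / (kT / k_spring + sigma**2)
print(f"fitted slope: {slope:.3f}, analytic slope: {expected_slope:.3f}")
```

Each individual target is very noisy, yet the regression converges to the analytic force of the smoothed distribution, which is why such targets can stand in for force labels. How these targets are combined with atomistic forces in the paper is not reproduced here.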
Topics: Generative Design & Molecule Optimization, Quantum Chemistry & Force Fields
Tags: diffusion-model, molecular-dynamics
arXiv categories: physics.bio-ph, physics.chem-ph