Proof-Carrying Materials: Falsifiable Safety Certificates for Machine-Learned Interatomic Potentials

Abhinaba Basu, Pavan Chakraborty

arXiv:2603.12183 · cond-mat.mtrl-sci · Published 2026-03-12 · Updated 2026-03-13

Machine-learned interatomic potentials (MLIPs) are deployed for high-throughput materials screening without formal reliability guarantees. We show that a single MLIP used as a stability filter misses 93% of density functional theory (DFT)-stable materials (recall 0.07) on a 25,000-material benchmark. Proof-Carrying Materials (PCM) closes this gap through three stages: adversarial falsification across compositional space, bootstrap envelope refinement with 95% confidence intervals, and Lean 4 formal certification. Auditing CHGNet, TensorNet and MACE reveals architecture-specific blind spots with near-zero pairwise error correlations (r ≤ 0.13; n = 5,000), confirmed by independent Quantum ESPRESSO validation (20/20 converged; median DFT/CHGNet force ratio 12×). A risk model trained on PCM-discovered features predicts failures on unseen materials (AUC-ROC = 0.938 ± 0.004) and transfers across architectures (cross-MLIP AUC-ROC ≈ 0.70; feature importance r = 0.877). In a thermoelectric screening case study, PCM-audited protocols discover 62 additional stable materials missed by single-MLIP screening, a 25% improvement in discovery yield.
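To make the link between "misses 93%" and "recall 0.07" concrete, and to illustrate what a 95% bootstrap interval looks like, here is a minimal sketch. It is not the authors' procedure: the abstract does not specify how the bootstrap envelope is built, so a generic percentile bootstrap is assumed. The data are synthetic toy labels chosen only to reproduce a recall of 0.07, and the names `recall` and `bootstrap_ci` are hypothetical.

```python
import random

def recall(pairs):
    """Recall of a stability filter.

    pairs: list of (dft_stable, mlip_stable) booleans, one per material.
    Returns the fraction of DFT-stable materials the MLIP also calls stable.
    """
    positives = [p for p in pairs if p[0]]
    if not positives:
        raise ValueError("no DFT-stable materials in the sample")
    true_pos = sum(1 for _, mlip_ok in positives if mlip_ok)
    return true_pos / len(positives)

def bootstrap_ci(data, stat, n_resamples=2000, alpha=0.05, seed=0):
    """Generic percentile bootstrap interval for stat(data) (assumed method)."""
    rng = random.Random(seed)
    n = len(data)
    # Resample with replacement, recompute the statistic, then sort.
    stats = sorted(
        stat([data[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_resamples)
    )
    lo = stats[int((alpha / 2) * n_resamples)]
    hi = stats[min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))]
    return lo, hi

# Synthetic toy data: 100 DFT-stable materials, 7 of which the MLIP keeps.
pairs = [(True, i < 7) for i in range(100)]

point = recall(pairs)            # 7 / 100 = 0.07
missed = 1.0 - point             # 0.93, i.e. 93% of stable materials missed
lo, hi = bootstrap_ci(pairs, recall)
print(f"recall={point:.2f}, missed={missed:.2f}, 95% CI=({lo:.2f}, {hi:.2f})")
```

The interval width here reflects only the toy sample size of 100; on a benchmark of the scale reported in the abstract, the same procedure would yield a much tighter interval.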

Topics: Machine-Learned Potentials for Sulfides and Minerals

Tags: chgnet, density-functional-theory, mace, mlip

arXiv categories: cond-mat.mtrl-sci, cs.AI, cs.LG, physics.comp-ph
