New Statistical Framework for Extreme Error Probability in High-Stakes Domains for Reliable Machine Learning

Umberto Michelucci, Francesca Venturini

arXiv:2503.24262 · cs.LG · Published 2025-03-31

Machine learning is vital in high-stakes domains, yet conventional validation methods rely on averaged metrics such as mean squared error (MSE) or mean absolute error (MAE), which fail to quantify extreme errors. Worst-case prediction failures can have substantial consequences, but current frameworks lack statistical foundations for assessing their probability. This work presents a new statistical framework, based on Extreme Value Theory (EVT), that provides a rigorous approach to estimating worst-case failures. Applied to synthetic and real-world datasets, the method is shown to enable robust estimation of catastrophic failure probabilities, overcoming the fundamental limitations of standard cross-validation. This work establishes EVT as a fundamental tool for assessing model reliability, supporting safer AI deployment in new technologies where uncertainty quantification is central to decision-making or scientific analysis.
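The abstract does not say which EVT variant the authors use, so the following is only an illustrative sketch of one standard EVT technique, peaks-over-threshold with a Generalized Pareto Distribution (GPD) fitted to the largest absolute prediction errors. It assumes a method-of-moments fit (chosen here for simplicity, in place of the more common maximum-likelihood fit), a 95% empirical quantile as threshold, and synthetic Gaussian errors; the function names `fit_gpd_moments` and `tail_probability` are hypothetical and not taken from the paper.

```python
import math
import random
import statistics


def fit_gpd_moments(excesses):
    """Method-of-moments estimates (xi, sigma) for a GPD fitted to threshold excesses.

    Uses mean = sigma / (1 - xi) and variance = sigma^2 / ((1 - xi)^2 (1 - 2 xi)),
    which is only valid for shape xi < 0.5 (an assumption of this sketch).
    """
    m = statistics.fmean(excesses)
    v = statistics.variance(excesses)
    ratio = m * m / v                 # equals 1 - 2*xi for an exact GPD
    xi = 0.5 * (1.0 - ratio)
    sigma = 0.5 * m * (ratio + 1.0)
    return xi, sigma


def tail_probability(errors, level, threshold_quantile=0.95):
    """Estimate P(|error| > level) with a peaks-over-threshold GPD tail model."""
    abs_err = sorted(abs(e) for e in errors)
    n = len(abs_err)
    u = abs_err[int(threshold_quantile * n)]        # empirical threshold
    excesses = [e - u for e in abs_err if e > u]
    if level <= u:
        # Below the threshold there is enough data for the empirical frequency.
        return sum(e > level for e in abs_err) / n
    xi, sigma = fit_gpd_moments(excesses)
    p_exceed_u = len(excesses) / n                  # P(|error| > u)
    z = (level - u) / sigma
    if abs(xi) < 1e-9:
        survival = math.exp(-z)                     # exponential limit as xi -> 0
    else:
        base = 1.0 + xi * z
        # For xi < 0 the GPD has a finite upper endpoint; beyond it the tail is 0.
        survival = base ** (-1.0 / xi) if base > 0 else 0.0
    return p_exceed_u * survival


# Synthetic prediction errors (illustrative only, not data from the paper).
random.seed(0)
errors = [random.gauss(0.0, 1.0) for _ in range(20000)]
p3 = tail_probability(errors, 3.0)
p4 = tail_probability(errors, 4.0)
print(f"mean absolute error: {statistics.fmean(abs(e) for e in errors):.3f}")
print(f"estimated P(|error| > 3): {p3:.2e}")
print(f"estimated P(|error| > 4): {p4:.2e}")
```

The mean absolute error summarizes typical behavior, whereas the fitted tail gives a probability for errors larger than most or all of those observed, which is the kind of quantity the abstract argues averaged metrics cannot provide. In practice the threshold choice and the fitting method both affect the estimate and would need to be checked.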

Topics: Generative Models & Discovery

Tags: uncertainty-quantification

arXiv categories: cs.LG, cs.AI, stat.ME, stat.ML
