Hallucination, reliability, and the role of generative AI in science

Charles Rathkopf

arXiv:2504.08526 · cs.CY · Published 2025-04-11 · Updated 2026-01-13

Generative AI increasingly supports scientific inference, from protein structure prediction to weather forecasting. Yet its distinctive failure mode, hallucination, raises epistemic alarm bells. I argue that this failure mode can be addressed by shifting from data-centric to phenomenon-centric assessment. Through case studies of AlphaFold and GenCast, I show how scientific workflows discipline generative models through theory-guided training and confidence-based error screening. These strategies convert hallucination from an unmanageable epistemic threat into bounded risk. When embedded in such workflows, generative models support reliable inference despite opacity, provided they operate in theoretically mature domains.

Topics: Generative Design & Molecule Optimization, Protein & Biomolecules

Tags: generative-model, protein-structure, structure-prediction

arXiv categories: cs.CY, cs.AI
