KANEL: Kolmogorov-Arnold Network Ensemble Learning Enables Early Hit Enrichment in High-Throughput Virtual Screening

Pavel Koptev, Nikita Krainov, Konstantin Malkov, Alexander Tropsha

arXiv:2603.25755 · physics.chem-ph · Published 2026-03-25

Machine learning models of chemical bioactivity are increasingly used to prioritize a small number of compounds from virtual screening libraries for experimental follow-up. In these applications, assessing model accuracy by early hit enrichment, such as the positive predictive value (PPV) calculated for the top N hits (PPV@N), is more appropriate and actionable than traditional global metrics such as AUC. We present KANEL, an ensemble workflow that combines interpretable Kolmogorov-Arnold Networks (KANs) with XGBoost, random forest, and multilayer perceptron models trained on complementary molecular representations (LillyMol descriptors, RDKit-derived descriptors, and Morgan fingerprints).
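To make the PPV@N metric concrete, the following is a minimal sketch, assuming binary activity labels (1 = active, 0 = inactive) and one model score per compound. The function name `ppv_at_n` and the toy data are hypothetical and are not taken from the paper; this illustrates the standard definition of the metric (fraction of true actives among the N top-ranked compounds), not the authors' implementation.

```python
def ppv_at_n(y_true, scores, n):
    """Return the fraction of true actives among the n highest-scoring compounds."""
    if len(y_true) != len(scores):
        raise ValueError("y_true and scores must have the same length")
    if not 0 < n <= len(scores):
        raise ValueError("n must be between 1 and the number of compounds")
    # Rank compounds by predicted score, highest first.
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    # Count the actives in the top n and divide by n.
    return sum(label for _, label in ranked[:n]) / n


# Toy screening library (illustrative values only): 1 = active, 0 = inactive.
labels = [1, 0, 1, 0, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.2, 0.1, 0.6, 0.3, 0.4]

print(ppv_at_n(labels, scores, 4))  # 3 of the top 4 are active -> 0.75
```

Unlike AUC, which summarizes ranking quality over the whole library, this quantity depends only on the compounds that would actually be selected for testing.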

Topics: Property Prediction & ADMET

Tags: drug-discovery, molecular-representation

arXiv categories: physics.chem-ph, cs.LG, q-bio.QM, stat.ML
