Scikit-fingerprints: easy and efficient computation of molecular fingerprints in Python
Jakub Adamczyk, Piotr Ludynia
arXiv:2407.13291 · cs.SE · Published 2024-07-18 · Updated 2025-08-10
In this work, we present scikit-fingerprints, a Python package for computing molecular fingerprints for applications in chemoinformatics. Our library offers an industry-standard scikit-learn interface, allowing intuitive usage and easy integration with machine learning pipelines. It is also highly optimized, featuring parallel computation that enables efficient processing of large molecular datasets. Currently, scikit-fingerprints is the most feature-rich library in the open-source Python ecosystem, offering over 30 molecular fingerprints. Our library simplifies chemoinformatics tasks based on molecular fingerprints, including molecular property prediction and virtual screening. It is also flexible, highly efficient, and fully open source.
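To make the interface pattern described in the abstract concrete, the following is a minimal, standard-library-only sketch of a scikit-learn-style fingerprint transformer with optional parallel computation, followed by a toy virtual screening step based on Tanimoto similarity. It does not use or reproduce the scikit-fingerprints API: the class `ToySubstringFingerprint`, its parameters (`fp_size`, `ngram`, `n_jobs`), and the helper `tanimoto` are hypothetical names chosen for illustration, and the hashed character n-grams are not a chemically meaningful fingerprint.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor


class ToySubstringFingerprint:
    """Hypothetical stand-in for a fingerprint transformer (NOT the real library API).

    Hashes character n-grams of a SMILES string into a fixed-size bit vector.
    The result is not chemically meaningful; it only illustrates the
    fit/transform convention and a parallel code path.
    """

    def __init__(self, fp_size=64, ngram=3, n_jobs=1):
        self.fp_size = fp_size
        self.ngram = ngram
        self.n_jobs = n_jobs

    def fit(self, X, y=None):
        # Stateless: nothing is learned, so fit simply returns self,
        # following the scikit-learn convention.
        return self

    def _one(self, smiles):
        # Set one bit per n-gram, using a deterministic hash (CRC32).
        bits = [0] * self.fp_size
        for i in range(max(len(smiles) - self.ngram + 1, 1)):
            chunk = smiles[i : i + self.ngram]
            bits[zlib.crc32(chunk.encode()) % self.fp_size] = 1
        return bits

    def transform(self, X):
        if self.n_jobs == 1:
            return [self._one(s) for s in X]
        # Threads keep the sketch simple; they show the parallel code path
        # but do not speed up CPU-bound pure-Python work.
        with ThreadPoolExecutor(max_workers=self.n_jobs) as pool:
            return list(pool.map(self._one, X))

    def fit_transform(self, X, y=None):
        return self.fit(X, y).transform(X)


def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two equal-length bit vectors."""
    both = sum(x & y for x, y in zip(a, b))
    either = sum(x | y for x, y in zip(a, b))
    return both / either if either else 0.0


smiles = ["CCO", "CCN", "c1ccccc1", "CC(=O)O"]
fps = ToySubstringFingerprint(fp_size=64, n_jobs=2).fit_transform(smiles)

# Toy virtual screening: rank all molecules by similarity to the first one.
query = fps[0]
ranked = sorted(zip(smiles, fps), key=lambda p: tanimoto(query, p[1]), reverse=True)
```

Because the transformer exposes `fit`, `transform`, and `fit_transform`, its output can be passed to any downstream estimator that accepts a list of fixed-length vectors, which is the kind of pipeline integration the abstract refers to.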
Topics: Molecular Representation & Learning, Property Prediction & ADMET
Tags: drug-discovery, molecular-representation, property-prediction
arXiv categories: cs.SE, cs.LG