Scikit-fingerprints: easy and efficient computation of molecular fingerprints in Python

Jakub Adamczyk, Piotr Ludynia

arXiv:2407.13291 · cs.SE · Published 2024-07-18 · Updated 2025-08-10

In this work, we present scikit-fingerprints, a Python package for computing molecular fingerprints for chemoinformatics applications. The library offers the industry-standard scikit-learn interface, which makes it intuitive to use and easy to integrate with machine learning pipelines. It is also highly optimized, with parallel computation that enables efficient processing of large molecular datasets. At the time of writing, scikit-fingerprints is the most feature-rich molecular fingerprint library in the open source Python ecosystem, offering over 30 molecular fingerprints. It simplifies fingerprint-based chemoinformatics tasks, including molecular property prediction and virtual screening, and it is flexible and fully open source.

Topics: Molecular Representation & Learning, Property Prediction & ADMET

Tags: drug-discovery, molecular-representation, property-prediction

arXiv categories: cs.SE, cs.LG
