ToolRosetta: Scalable Tool Access for Open-World Scientific Agents

Shimin Di, Xujie Yuan, Hanghui Guo, Chaoqian Ouyang, Yongxu Liu, Ling Yue, Zhangze Chen, Libin Zheng, Jia Zhu, Shaowu Pan, Jian Yin, Yong Rui, Min-Ling Zhang

arXiv:2603.09290 · cs.SE · Published 2026-03-10 · Updated 2026-04-10

Large Language Model (LLM)-based agent systems are increasingly being used for scientific discovery, yet their practical capability remains constrained by a narrow and manually curated tool layer. Much scientific computational capability already exists in open-source repositories, software packages and APIs, but these resources remain difficult to standardize, operationalize and invoke reliably. Here we present ToolRosetta, a framework that equips LLM-based agent systems with scalable, open-world computational access by automatically transforming heterogeneous computational programs into validated, callable tools. ToolRosetta integrates repository retrieval, tool standardization, execution testing, iterative repair and security-aware governance. Across 122 GitHub repositories spanning 35 subdisciplines in 6 domains, ToolRosetta standardizes 1,580 callable tools. These tools support an average verified task success rate of 84.0% across domains and substantially enhance existing agentic AI systems, including OpenClaw, particularly on out-of-distribution tasks beyond fixed curated tool inventories.
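The abstract names execution testing and iterative repair as stages but does not describe their implementation. The following is only a minimal sketch of what a generic test-and-repair loop could look like, not ToolRosetta's actual code. Every name here (`ToolCandidate`, `run_test`, `validate_with_repair`, and the `repair` callback) is hypothetical, and the repair step, which in an agent system would presumably be an LLM-proposed patch, is replaced by a plain Python function.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class ToolCandidate:
    """Hypothetical wrapper around a function extracted from a repository."""
    name: str
    func: Callable[[float], float]
    history: List[str] = field(default_factory=list)  # errors seen so far


def run_test(tool: ToolCandidate, arg: float, expected: float) -> Optional[str]:
    """Execute the tool once; return None on success, else an error message."""
    try:
        result = tool.func(arg)
    except Exception as exc:  # any runtime failure counts as a failed test
        return f"{type(exc).__name__}: {exc}"
    if abs(result - expected) > 1e-9:
        return f"expected {expected}, got {result}"
    return None


def validate_with_repair(
    tool: ToolCandidate,
    repair: Callable[[ToolCandidate, str], Callable[[float], float]],
    arg: float,
    expected: float,
    max_rounds: int = 3,
) -> bool:
    """Test the tool; on failure, ask `repair` for a new implementation and retry."""
    for _ in range(max_rounds):
        error = run_test(tool, arg, expected)
        if error is None:
            return True  # tool is validated
        tool.history.append(error)
        # Stand-in for a repair step (assumption: a real system would patch code).
        tool.func = repair(tool, error)
    return False  # repair budget exhausted without a passing test


def toy_repair(tool: ToolCandidate, error: str) -> Callable[[float], float]:
    """Toy repair: always substitutes a working doubling function."""
    return lambda x: x * 2


broken = ToolCandidate(name="double", func=lambda x: x / 0)
validated = validate_with_repair(broken, toy_repair, arg=2.0, expected=4.0)
```

In this toy run the first execution raises `ZeroDivisionError`, the error is recorded in `broken.history`, and the second round passes after the substitute implementation is installed. Bounding the loop with `max_rounds` is one way to keep a repair process from running indefinitely on a tool that cannot be fixed.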

Topics: Generative Models & Discovery

Tags: scientific-discovery

arXiv categories: cs.SE, cs.CE, cs.MA
