Abstract
Large language model-based agents increasingly rely on external tools for perception, computation, and actuation. As tool catalogs grow, agents face a combinatorial choice problem that leads to tool misuse, degraded reliability, and increased latency. This paper presents ToolSEE, a real-time tool search engine that ranks and filters candidate tools against the agent's immediate task and context, returning a compact, explainable shortlist for execution. ToolSEE integrates three core components: metadata-centric indexing of tool capabilities and safety signals, relevance scoring that fuses semantic matching with contextual cues from the agent's dialogue and state, and safety-aware filtering that downranks or removes hazardous, redundant, or low-quality tools. We describe the architecture, the integration pattern with agent loops, and an evaluation across synthetic and application-oriented tasks. Results indicate that ToolSEE reduces unproductive tool exploration, improves response consistency, and lowers the incidence of tool-induced hallucinations while preserving a drop-in integration experience. We conclude with an analysis of limitations and discuss opportunities for deeper safety signals, feedback-driven re-ranking, and standardized provenance reporting.
The source code is available at github.com/Pro-GenAI/Agent-ToolSEE.
Keywords: Artificial Intelligence, AI Agents, Large Language Models, LLMs, LLM agents, context engineering, decision support