AutoTool Trains LLMs to Dynamically Select Tools for Superior Reasoning
Jiaru Zou, Ling Yang, Yunzhe Qi, Sirui Chen, Mengting Ai, Ke Shen, Jingrui He, Mengdi Wang
Researchers have introduced AutoTool, a training framework that lets AI agents choose tools on the fly while solving problems. Described in a paper on arXiv, the work tackles a real weakness in current large language model (LLM) agents: they typically assume a fixed set of tools. That assumption breaks down when the tools change or new ones appear.
AutoTool works in two phases. First, it uses supervised fine-tuning and reinforcement learning to stabilize the agent's reasoning, keeping its thinking coherent across long chains of steps. Second, it applies a ranking method called KL-regularized Plackett-Luce ranking so that the agent's tool selection stays consistent across multiple steps. The team also built a 200,000-sample dataset with explicit tool-selection rationales, covering over 1,000 tools across 100 tasks in math, science, code, and multimodal reasoning.
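To give a feel for what a KL-regularized Plackett-Luce objective can look like, here is a minimal sketch in plain Python. It is not the paper's implementation: the article does not give the exact loss, so the combination below (Plackett-Luce negative log-likelihood plus a weighted KL term between the softmax of the current tool scores and that of a reference model) is an assumption, and the function names, the `beta` weight, and the example scores are all hypothetical.

```python
import math


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) over a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def plackett_luce_log_likelihood(scores, ranking):
    """Log-probability of `ranking` (tool indices, best first) under Plackett-Luce.

    At each position the chosen tool competes, via a softmax over scores,
    against every tool that has not been ranked yet.
    """
    total = 0.0
    for k in range(len(ranking)):
        remaining = [scores[i] for i in ranking[k:]]
        total += scores[ranking[k]] - log_sum_exp(remaining)
    return total


def softmax(scores):
    lse = log_sum_exp(scores)
    return [math.exp(s - lse) for s in scores]


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same tools."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def kl_regularized_pl_loss(scores, ref_scores, ranking, beta=0.1):
    """Assumed objective: ranking NLL plus beta * KL(current || reference)."""
    nll = -plackett_luce_log_likelihood(scores, ranking)
    kl = kl_divergence(softmax(scores), softmax(ref_scores))
    return nll + beta * kl


# Hypothetical scores for three candidate tools and a preferred ordering.
scores = [2.0, 0.5, 1.0]        # current policy's tool scores
ref_scores = [1.0, 1.0, 1.0]    # reference policy's tool scores
ranking = [0, 2, 1]             # tool 0 preferred, then tool 2, then tool 1
loss = kl_regularized_pl_loss(scores, ref_scores, ranking)
```

In this sketch the ranking term rewards scores that agree with the preferred tool ordering, while the KL term penalizes drifting far from the reference scores; a larger `beta` keeps the policy closer to the reference.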
To test it, they trained two base models, Qwen3-8B and Qwen2.5-VL-7B, and evaluated them on ten benchmarks. Despite having fewer parameters, the AutoTool-trained models outperformed other advanced agents and tool-integration methods. Average gains were 6.4% in math and science reasoning, 4.5% in search-based QA, 7.7% in code generation, and 6.9% in multimodal understanding. More interestingly, the system generalized well, dynamically using unseen tools from evolving toolsets during inference.
That last point is the real news. Most agents are brittle when their environment shifts. AutoTool suggests a way to build agents that aren't locked into a static toolkit, which matters because real-world tasks constantly introduce new tools. If this scales, it could make LLM agents far more practical for open-ended problem solving.
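To make the contrast between a static toolkit and an evolving one concrete, here is a toy sketch of a tool registry that can gain tools at inference time. Everything in it is hypothetical: the `Tool` and `ToolRegistry` names are invented for illustration, and the word-overlap scoring is only a stand-in for the trained model's selection step, not AutoTool's method.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Tool:
    name: str
    description: str
    run: Callable[[str], str]


class ToolRegistry:
    """Hypothetical registry: tools are chosen by description, not hard-coded."""

    def __init__(self) -> None:
        self._tools: Dict[str, Tool] = {}

    def register(self, tool: Tool) -> None:
        # New tools can be added at any time, including after deployment.
        self._tools[tool.name] = tool

    def select(self, query: str) -> Tool:
        # Stand-in scorer: count words shared by the query and each description.
        # A trained agent would rank the candidate tools here instead.
        query_words = set(query.lower().split())

        def overlap(tool: Tool) -> int:
            return len(query_words & set(tool.description.lower().split()))

        return max(self._tools.values(), key=overlap)


registry = ToolRegistry()
registry.register(Tool("calculator", "evaluate arithmetic math expressions", lambda q: "calc"))
registry.register(Tool("search", "search the web for facts", lambda q: "results"))

first = registry.select("search the web for the capital of France")

# A tool that did not exist when the registry was built.
registry.register(Tool("translator", "translate text between languages", lambda q: "translated"))
second = registry.select("translate this text into French")
```

Because selection reads tool descriptions at call time, the newly registered translator becomes a candidate without any change to the selection code, which is the property a fixed, hard-coded toolset lacks.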